(57) Abstract

Autosomal dominant polycystic kidney disease (ADPKD) is a common genetic disorder which frequently results in renal failure, due
to progressive cyst development. The major locus, PKD1, maps to 16p13.3. A chromosome translocation associated with ADPKD is
identified which disrupts a gene (PBP), encoding a 14 kb transcript, in the PKD1 candidate region. Further mutations of the PBP gene
were found in PKD1 patients, confirming that PBP is the PKD1 gene. This gene is located adjacent to the tuberous sclerosis (TSC2) locus in
a genomic region that is reiterated more proximally on 16p. The duplicate area encodes three transcripts substantially homologous to the
PKD1 transcript. Partial sequence analysis of the PKD1 transcript shows that it encodes a novel protein. Screening of actual or suspected
ADPKD patients for normal or mutated PKD1 can be used for diagnostic purposes. PKD1-associated disorders such as ADPKD may be
treated or prevented by PKD1 gene therapy and/or administration of functional PKD1 protein to affected cells.

POLYCYSTIC KIDNEY DISEASE 1 GENE AND USES THEREOF

The present invention relates to the polycystic
kidney disease 1 (PKD1) gene, mutations thereof in
patients having PKD1-associated disorders, the protein
encoded by the PKD1 gene, and their uses in diagnosis
and therapy.

Background to the Invention 

All references mentioned hereinbelow are listed in
full at the end of the description and are herein
incorporated by reference in their entirety. Except
where the context clearly indicates otherwise,
references to the PBP gene, transcript, sequence,
protein or the like can be read as referring to the
PKD1 gene, transcript, sequence, protein or the like,
respectively.

A landmark study by Dalgaard (1957) showed that
autosomal dominant polycystic kidney disease (ADPKD),
also termed adult polycystic kidney disease (APKD), is
one of the commonest genetic diseases of man
(approximately 1/1000 individuals affected). The major
feature of this dominant disease is the development of
cystic kidneys, which commonly leads to renal failure in
adult life. This simple description, however, belies
the diverse systemic nature of the disorder, which
affects many other organs (reviewed in Gabow, 1990) and
occasionally presents in childhood (Fink, et al., 1993;
Zerres, et al., 1993). Extrarenal manifestations
include liver cysts (Milutinovic, et al., 1980) and,
more rarely, cysts of the pancreas (Gabow, 1993) and
other organs. Intracranial aneurysms occur in
approximately 5% of patients and are a significant
cause of morbidity and mortality due to subarachnoid
haemorrhage (Chapman, et al., 1992). More recently, an
increased prevalence of cardiac valve defects (Hossack,
et al., 1988), herniae (Gabow, 1990) and colonic
diverticulae (Scheff, et al., 1980) has been reported.

The major cause of morbidity in ADPKD, however, is
progressive renal disease characterised by the
formation and enlargement of fluid-filled cysts,
resulting in grossly enlarged kidneys. Renal function
deteriorates as normal tissue is compromised by cystic
growth, resulting in end stage renal disease (ESRD) in
more than 50% of patients by the age of 60 years
(Gabow, et al., 1992); ADPKD accounts for 8-10% of all
renal transplantation and dialysis patients in Europe
and the USA (Gabow, 1993). Biochemical studies have
suggested several potential causes of cyst formation
and development, including abnormal epithelial cell
growth, alterations to the extracellular matrix and
changes in cellular polarity and secretion (reviewed in
Gabow, 1991; Wilson and Sherwood, 1991). The primary
defect in ADPKD, however, remains unclear, and
considerable effort has therefore been applied to
identifying the defective gene(s) in this disorder by
genetic approaches.

The first step towards positional cloning of an
ADPKD gene was the demonstration of linkage of one
locus, now designated the polycystic kidney disease 1
(PKD1) locus, to the α-globin cluster on the short arm
of chromosome 16 (Reeders, et al., 1985).
Subsequently, families with ADPKD unlinked to markers
of 16p were described (Kimberling, et al., 1988;
Romeo, et al., 1988) and a second ADPKD locus (PKD2)
has recently been assigned to chromosome region
4q13-q23 (Kimberling, et al., 1993; Peters, et al., 1993).
It is estimated that approximately 85% of ADPKD is due
to PKD1 (Peters and Sandkuijl, 1992), with PKD2
accounting for most of the remainder. PKD2 appears to
be a milder condition with a later age of onset and
ESRD (Parfrey, et al., 1990; Gabow, et al., 1992;
Ravine, et al., 1992).

The position of the PKD1 locus was refined to
chromosome band 16p13.3 and many markers were isolated
from that region (Breuning, et al., 1987; Reeders, et
al., 1988; Breuning, et al., 1990; Germino, et al.,
1990; Hyland, et al., 1990; Himmelbauer, et al.,
1991). Their order, and the position of the PKD1
locus, have been determined by extensive linkage
analysis in normal and PKD1 families and by the use of
a panel of somatic cell hybrids (Reeders, et al., 1988;
Breuning, et al., 1990; Germino, et al., 1990). An
accurate long range restriction map (Harris, et al.,
1990; Germino, et al., 1992) has located the PKD1
locus in an interval of approximately 600 kb between
the markers GGG1 and SM7 (Harris, et al., 1991;
Somlo, et al., 1992) (see Figure 1a). The density of
CpG islands and identification of many mRNA transcripts
indicated that this area is rich in gene sequences.
Germino et al. (1992) estimated that the candidate
region contains approximately 20 genes.

Identification of the PKD1 gene from within this
area has thus proved difficult and other means to
pinpoint the disease gene were sought. Linkage
disequilibrium has been demonstrated between PKD1 and
the proximal marker VK5 in a Scottish population
(Pound, et al., 1992) and between PKD1 and BLu24 (see
Figure 1a) in a Spanish population (Peral, et al.,
1994). Studies with additional markers have shown
evidence of a common ancestor in a proportion of each
population (Peral, et al., 1994; Snarey, et al.,
1994), but the association has not precisely positioned
the PKD1 locus.

Disease associated genomic rearrangements,
detected by cytogenetics or pulsed field gel
electrophoresis (PFGE), have been instrumental in the
identification of various genes associated with various
genetic disorders. Hitherto, no such abnormalities
related to PKD1 have been described. This situation
contrasts with that for the tuberous sclerosis locus
which lies within 16p13.3 (TSC2). In that case, TSC
associated deletions were detected by PFGE within the
interval thought to contain the PKD1 gene and their
characterisation was a significant step toward the
rapid identification of the TSC2 gene (European
Chromosome 16 Tuberous Sclerosis Consortium, 1993).
The TSC2 gene therefore maps within the candidate
region for the hitherto unidentified PKD1 gene; as
polycystic kidneys are a feature common to TSC and
ADPKD1 (Bernstein and Robbins, 1991), the possibility of
an aetiological link, as proposed by Kandt et al.
(1992), was considered.

We have now identified a pedigree in which the two
distinct phenotypes, typical ADPKD or TSC, are seen in
different members. In this family, the two individuals
with ADPKD are carriers of a balanced chromosome
translocation with a breakpoint within 16p13.3. We
have located the chromosome 16 translocation breakpoint
and a gene disrupted by this rearrangement has been
defined; the discovery of additional mutations of that
gene in other PKD1 patients shows that we have
identified the PKD1 gene.

Summary of the Invention

Accordingly, in one aspect, this invention
provides an isolated, purified or recombinant nucleic
acid sequence comprising:

(a) a PKD1 gene or its complementary strand,

(b) a sequence substantially homologous to, or
capable of hybridising to, a substantial portion of a
molecule defined in (a) above,

(c) a fragment of a molecule defined in (a) or
(b) above.

In particular, there is provided a sequence
wherein the PKD1 gene has the partial nucleic acid
sequence according to Figure 7 and/or 10. The
invention therefore includes a DNA molecule selected
from:

(a) a PKD1 gene or its complementary strand,

(b) a sequence substantially homologous to, or
capable of hybridising to, a substantial portion of a
molecule defined in (a) above,

(c) a molecule coding for a polypeptide having
the partial sequence of Figure 7,

(d) genomic DNA corresponding to a molecule in
(a) above; and

(e) a fragment of a molecule defined in any of
(a), (b), (c) or (d) above.

The PKD1 gene described herein is a gene found on
human chromosome 16, and the results of familial
studies described herein form the basis for concluding
that this PKD1 gene encodes a protein called PKD1
protein which has a role in the prevention or
suppression of ADPKD. The PKD1 gene therefore includes
the DNA sequences shown in Figures 7 and 10, and all
functional equivalents. The gene furthermore includes
regulatory regions which control the expression of the
PKD1 coding sequence, including promoter, enhancer and
terminator regions. Other DNA sequences such as
introns spliced from the end-product PKD1 RNA
transcript are also encompassed. Although work has
been carried out in relation to the human gene, the
corresponding genetic and functional sequences present
in lower animals are also encompassed.

The present invention therefore further provides a
PKD1 gene or its complementary strand having the
partial sequence according to Figure 7. In particular,
it provides a PKD1 gene or its complementary strand
having the partial sequence of Figures 7 and/or 10,
which gene or strand is mutated in some ADPKD patients
(more specifically, PKD1 patients).

The invention further provides a nucleic acid sequence comprising a
mutant PKD1 gene, especially one selected from a sequence comprising a
partial sequence according to Figures 7 and/or 10 when:

(a) [OX114] base pairs 1746-2192 as defined in Figure 7 are deleted
(446bp);

(b) [OX32] base pairs 3696-3831 as defined in Figure 7 are deleted
by a splicing defect;

(c) [OX875] about 5.5kb flanked by the two XbaI sites shown in
Figure 3a are deleted and the EcoRI site separating the CW10 (41kb) and JH1
(18kb) sites is thereby absent;

(d) [WS-53] about 100kb extending between the JH1 and OfJZX and the
SM6 and JH17 sites shown in Figure 6 and the PKD1 gene is thereby absent,
the deletion lying proximally between SM6 and JH13;

(e) [461] 18bp are deleted in the 75bp intron amplified by the
primer pair 3A3C insert at position 3696 of the 3' sequence as shown in
Figure 11;

(f) [OX1054] 20bp are deleted in the 75bp intron amplified by the
primer pair 3A3C insert at position 3696 of the 3' sequence as shown in
Figure 11;

(g) [WS-212] about 75kb are deleted between SM9-CW9 distally and the
PKD1 3'UTR proximally as shown in Figure 12;

(h) [WS-215] about 160kb are deleted between CW20 and SM6-JH17 as shown
in Figure 12;

(i) [WS-227] about 50kb are deleted between CW20 and JH11 as shown in
Figure 12;

(j) [WS-219] about 27kb are deleted between JH1 and JH6 as shown in
Figure 12;

(k) [WS-250] about 160kb are deleted between CW20 and BLu24 as shown in
Figure 12;

(l) [WS-194] about 65kb is deleted between CW20 and CW10.

The invention therefore extends to RNA molecules comprising an RNA
sequence corresponding to any of the DNA sequences set out above. The
molecule is preferably the transcript reference PBP and
identifiable from the restriction map of Figure 3a and
having a sequence of about 14 kb.

In another aspect, the invention provides a
nucleic acid probe having a sequence as set out above;
in particular, this invention extends to a purified
nucleic acid probe which hybridises to at least a
portion of the DNA or RNA molecule of any of the
preceding sequences. Preferably, the probe includes a
label such as a radiolabel, for example a 32P label.

In another aspect, this invention provides a
purified DNA or RNA coding for a protein comprising the
amino acid sequence of Figure 7 and/or 10, or a protein
or polypeptide having homologous properties with said
protein, or having at least one functional domain or
active site in common with said protein.

The DNA molecule defined above may be incorporated
in a recombinant cloning vector for expressing a
protein having the amino acid sequence of Figure 7
and/or 10, or a protein or a polypeptide having at
least one functional domain or active site in common
with said protein.

In another aspect, the invention provides a
polypeptide encoded by a sequence as set out above, or
having the amino acid sequence according to the partial
amino acid sequence of Figure 7 and/or 10, or a protein
or polypeptide having homologous properties with said
protein, or having at least one functional domain or
active site in common with said protein. In
particular, there is provided an isolated, purified or
recombinant polypeptide comprising a PKD1 protein or a
mutant or variant thereof or encoded by a sequence set
out above or a variant thereof having substantially the
same activity as the PKD1 protein.

This invention also provides an in vitro method of
determining whether an individual is likely to be
affected with tuberous sclerosis, comprising the steps
of:

assaying a sample from the individual to determine
the presence and/or amount of PKD1 protein or
polypeptide having the amino acid sequence of Figure 7
and/or 10.

Additionally or alternatively, a sample may be
assayed to determine the presence and/or amount of mRNA
coding for the protein or polypeptide having the amino
acid sequence of Figure 7 and/or 10, or to determine
the fragment lengths of fragments of nucleotide
sequences coding for the protein or polypeptide of
Figure 7 and/or 10, or to detect inactivating mutations
in DNA coding for a protein having the amino acid
sequence of Figure 7 and/or 10 or a protein having
homologous properties. Said screening preferably
includes applying a nucleic acid amplification process
to said sample to amplify a fragment of the DNA
sequence. Said nucleic acid amplification process
advantageously utilizes at least one of the following
sets of primers as identified herein:

AH3 F9 : AH3 B7
3A3 C1 : 3A3 C2
AH4 F2 : JH14 B3

Alternatively, said screening method may comprise
digesting said sample to provide EcoRI fragments and
hybridising with a DNA probe which hybridises to the
EcoRI fragment identified (A) in Figure 3(a), and said
DNA probe may comprise the DNA probe CW10 identified
herein.

Another screening method may comprise digesting
said sample to provide BamHI fragments and hybridising
with a DNA probe which hybridises to the BamHI fragment
identified (B) in Figure 3(a), and said DNA probe may
comprise the DNA probe 1A1H.6 identified herein.

A method according to the present invention may
comprise detecting a PKD1-associated disorder in a
patient suspected of having, or having predisposition
to, said disorder, the method comprising detecting the
presence of and/or evaluating the characteristics of
PKD1 DNA, PKD1 mRNA and/or PKD1 protein in a sample
taken from the patient. Such method may comprise
detecting and/or evaluating whether the PKD1 DNA is
deleted, missing, mutated, aberrant or not expressing
normal PKD1 protein. One way of carrying out such a
method comprises:

A. taking a biological, tissue or biopsy
sample from the patient;

B. detecting the presence of and/or evaluating
the characteristics of PKD1 DNA, PKD1 mRNA and/or PKD1
protein in the sample to obtain a first set of results;

C. comparing the first set of results with a
second set of results obtained using the same or
similar methodology for an individual not suspected of
having said disorders; and if the first and second sets
of results differ in that the PKD1 DNA is deleted,
missing, aberrant, mutated or not expressing PKD1
protein then that indicates the presence,
predisposition or tendency of the patient to develop
said disorders.

A specific method according to the invention
comprises extracting a sample of PKD1 DNA or DNA from
the PKD1 locus purporting to be PKD1 DNA from a
patient, cultivating the sample in vitro and analysing
the resulting protein, and comparing the resulting
protein with normal PKD1 protein according to the
well-established Protein Truncation Test.

Less sensitive tests include analysis of RNA using
RT-PCR (reverse transcriptase polymerase chain
reaction) and examination of genomic DNA.

On the other hand, if step C of the method is
replaced by:

C. comparing the first set of results with a
second set of results obtained using the same or
similar methodology in an individual known to have the
or at least one of said disorder(s); and if the first
and second sets of results are substantially identical,
this indicates that the PKD1 DNA in the patient is
deleted, mutated or not expressing normal PKD1 protein.

The invention further provides a method of
characterising a mutation in a subject suspected of
having a mutation in the PKD1 gene, which method
comprises:

A. amplifying each of the exons in the PKD1
gene of the subject;

B. denaturing the complementary strands of the
amplified exons;

C. diluting the denatured separate,
complementary strands to allow each single-stranded DNA
molecule to assume a secondary structural conformation;

D. subjecting the DNA molecule to
electrophoresis under non-denaturing conditions;

E. comparing the electrophoresis pattern of
the single-stranded molecule with the electrophoresis
pattern of a single-stranded molecule containing the
same amplified exon from a control individual which has
either a normal or PKD1 heterozygous genotype; and

F. sequencing any amplification product which
has an electrophoretic pattern different from the
pattern obtained from the DNA of the control
individual.
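
Steps E and F above amount to a pattern comparison followed by a selection. A minimal sketch of that logic is given here, on the assumption that each electrophoresis pattern has been reduced to a list of band migration distances. The function names, the tolerance value and the example data are hypothetical and are not taken from the description.

```python
def patterns_differ(patient_bands, control_bands, tolerance=0.5):
    """Return True if two single-strand band patterns differ.

    Each pattern is a list of band migration distances (arbitrary units).
    Patterns differ if the band counts differ or if any paired band is
    displaced by more than `tolerance` (a hypothetical threshold).
    """
    if len(patient_bands) != len(control_bands):
        return True
    pairs = zip(sorted(patient_bands), sorted(control_bands))
    return any(abs(p - c) > tolerance for p, c in pairs)


def exons_to_sequence(patient, control, tolerance=0.5):
    """Step F: list the exons whose pattern differs from the control."""
    return [exon for exon in patient
            if patterns_differ(patient[exon], control[exon], tolerance)]


# Illustrative values only; not measurements from the description.
control = {"exon_A": [10.0, 12.0], "exon_B": [8.0, 9.5]}
patient = {"exon_A": [10.1, 12.0], "exon_B": [8.0, 9.5, 11.0]}

print(exons_to_sequence(patient, control))  # ['exon_B']
```

In this sketch an extra band in the patient lane, as might arise from a heterozygous change, is enough to flag the exon for sequencing.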

The invention also extends to a diagnostic kit for
carrying out a method as set out above, comprising
nucleic acid primers for amplifying a fragment of the
DNA or RNA sequences defined above. The nucleic acid
primers may comprise at least one of the following
sets:

AH3 F9 : AH3 B7
3A3 C1 : 3A3 C2
AH4 F2 : JH14 B3

Another embodiment of the kit may comprise one or more
substances for digesting a sample to provide EcoRI
fragments and a DNA probe as previously defined.

A further embodiment of the kit may comprise one or
more substances for digesting a sample to provide BamHI
fragments and a DNA probe as previously defined.

Still further, a kit may include a nucleic acid
probe capable of hybridising to the DNA or RNA molecule
previously defined.

A vector (such as Bluescript (available from
Stratagene)) comprising a nucleic acid sequence set out
above, and a host cell (such as E. coli strain XL-1
Blue (available from Stratagene)) transfected or
transformed with the vector are also provided, together
with the use of such a vector or a nucleic acid
sequence set out above in gene therapy and/or in the
preparation of an agent for treating or preventing a
PKD1-associated disorder. Therefore there is further
provided a method of treating or preventing a
PKD1-associated disorder which method comprises
administering to a patient in need thereof a functional
PKD1 gene to affected cells in a manner that permits
expression of PKD1 protein therein and/or a transcript
produced from a mutated chromosome (such as the deleted
WS-212 chromosome) which is capable of expressing
functional PKD1 protein therein.

The invention also extends to any inventive
combination of features set out above or in the
following description.

Brief Description Of The Drawings

Figure 1a (top): A long range map of the terminal
region of the short arm of chromosome 16 showing the
PKD1 candidate region defined by genetic linkage
analysis. The positions of selected DNA probes and
microsatellites used for haplotype, linkage or
heterozygosity analyses are indicated. Markers
previously described in linkage disequilibrium studies
are shown in bold (from: Harris, et al., 1990; Harris,
et al., 1991; Germino, et al., 1992; Somlo, et al.,
1992; Peral, et al., 1994; Snarey, et al., 1994).

(bottom): A detailed map of the distal part of
the PKD1 candidate region showing: the area of 16p13.3
duplicated in 16p13.1 (hatched); C, Cla I restriction
sites; the breakpoints in the somatic cell hybrids,
N-OKI and P-MWH2A; DNA probes and the TSC2 gene. The
limits of the position of the translocation breakpoint
found in family 77 (see b), determined by evidence of
heterozygosity (in 77-4) and PFGE (see c and text), are
also indicated. The contig covering the 77 breakpoint
region consists of the cosmids: 1, CW9D; 2, ZDS5; 3,
JH2A; 4, REP59; 5, JC10.2B; 6, CW10III; 7, SM25A; 8,
SMII; 9, NM17.

Figure 1b: Pedigree of family 77 which segregates
a 16;22 translocation, showing the chromosomal
composition of each subject. Individuals 77-2 and 77-3
have the balanced products of the exchange - and have
PKD1; 77-4 is monosomic for 16p13.3-->16pter and
22q11.21-->22pter - and has TSC.

Figure 1c: PFGE of DNA from members of the 77
family: 77-1 (1); 77-2 (2); 77-3 (3); 77-4 (4);
digested with Cla I and hybridised with SM6. In
addition to the normal fragment of 340 kb and a partially
digested fragment of 480 kb, a proximal breakpoint
fragment of approximately 100 kb (arrowed) is seen in
individuals 77-2, 77-3 and 77-4, concordant with
segregation of the der(16) chromosome.

Figure 2: FISH of the cosmid CW10III (cosmid 6;
Figure 1a) to a normal male metaphase. Duplication of
this locus is illustrated with two sites of
hybridisation on 16p; the distal site (the PKD1 region)
is arrowed. The signal from the proximal site
(16p13.1) is stronger than that from the distal,
indicating that sequences homologous to CW10III are
reiterated in 16p13.1.

Figure 3a: A detailed map of the 77 translocation
region showing the precise localisation of the 77
breakpoint and the region that is duplicated in 16p13.1
(hatched). DNA probes (open boxes); the transcripts,
PKD1 and TSC2 (filled boxes; with direction of
transcription indicated by an arrow) and cDNAs (grey
boxes) are shown below the genomic map. The known
genomic extent of each gene is indicated at the bottom
of the diagram and the approximate genomic location of
each cDNA is indicated under the genomic map. The
positions of genomic deletions found in PKD1 patients,
OX875 and OX114, are also indicated. Restriction sites
for EcoR I (E) and incomplete maps for BamH I (B), Sac
I (S) and Xba I (X) are shown. SM3 is a 2kb BamH I
fragment shown at the 5' end of the gene.

Figure 3b: Southern blots of BamH I digested DNA
from individuals: 77-1 (1); 77-2 (2); and 77-4 (4)
hybridised with: left panel, 8S3 and right panel, 8S1
(see a). 8S3 detects a novel fragment on the telomeric
side of the breakpoint (12 kb: arrowed) associated
with the der(22) chromosome in 77-2, but not 77-4;
8S1 identifies a novel fragment on the centromeric side
of the breakpoint (9 kb: arrowed) - associated with the
der(16) chromosome - in 77-2 and 77-4. The telomeric
breakpoint fragment is also seen weakly with 8S1
(arrowed), indicating that the breakpoint lies in the
distal part of 8S1. The 8S3 and 8S1 loci are both
duplicated; the normal BamH I fragment detected at the
16p13.3 site by these probes is 11 kb (see a), but a
similar sized fragment is also detected at the 16p13.1
site. Consequently, the breakpoint fragments are much
fainter than the normal (16p13.1 plus 16p13.3) band.

Figure 4a: PBP cDNA, 3A3, hybridised to a Northern
blot containing ~1 µg polyA selected mRNA per lane of
the tissue specific cell lines: lane 1, MJ, EBV-
transformed lymphocytes; lane 2, K562,
erythroleukaemia; lane 3, FS1, normal fibroblasts; lane
4, HeLa, cervical carcinoma; lane 5, G401, renal Wilms'
tumour; lane 6, Hep3B, hepatoma; lane 7, HT29, colonic
adenocarcinoma; lane 8, SW13, adrenal carcinoma; lane
9, G-CCM, astrocytoma. A single transcript of
approximately 14 kb is seen; the highest level of
expression is in fibroblasts and in the astrocytoma
cell line, G-CCM. Although in this comparative
experiment little expression is seen in lanes 1, 4 and
7, we have demonstrated at least a low level of
expression in these cell lines on other Northern blots
and by RT-PCR (see later).

Figure 4b: A Northern blot containing ~20 µg of
total RNA from the cell line G-CCM hybridised with
cDNAs or a genomic probe which identify various parts
of the PBP gene. Left panel, a single ~14 kb
transcript is seen with a cDNA from the single copy
area, 3A3. Right panel, a cDNA, 21P.9, that is
homologous to parts of the region that is duplicated
(JH12, JH8 and JH10; see Figure 3a) hybridises to the
PBP transcript and three novel transcripts: HG-A (~21
kb), HG-B (~17 kb) and HG-C (8.5 kb). A similar
pattern of transcripts is seen with cDNAs and genomic
fragments that hybridise to the area between JH5 and
JH13, with the exception of the JH8 area. Middle
panel, JH8 hybridises to the transcripts PBP, HG-A and
HG-B but not to HG-C.

Figure 4c: A Northern blot of 20 µg total
fibroblast RNA from: normal control (N); 77-2 (2); 77-4
(4) hybridised with 8S1, which contains the 16;22
translocation breakpoint (see Figure 3). A transcript
of ~9 kb (PBP-77) is identified in the two patients
with this translocation but not in the normal control.
PBP-77 is a chimeric PBP transcript formed due to the
translocation and is not seen in 77-2 or 77-4 RNA with
probes which map distal to the breakpoint.

Figure 5a: FIGE of DNA from: normal (N) and ADPKD
patient OX875 (875), digested with EcoR I and
hybridised with: left panel, CW10; middle panel, JH1.
Normal fragments of 41 kb (plus a 31 kb fragment from
the 16p13.1 site), CW10, and 18 kb, JH1, are identified
with these probes; OX875 has an additional 53 kb band
(arrowed). The EcoR I site separating these two
fragments is removed by the deletion (see Figure 3a).
The right panel shows a Southern blot of BamH I
digested DNA (as above) hybridised with 1A1H.6. A
novel fragment of 9.5 kb is seen in OX875 DNA, as well
as the normal 15 kb fragment. These results indicate
that OX875 has a 5.5 kb deletion; its position was
determined more precisely by mapping relative to two
Xba I sites which flank the deletion (see Figure 3a).
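
The deletion size quoted for this patient follows from simple fragment arithmetic, and a minimal sketch of that check is given here. The function names are hypothetical and introduced only for illustration; the fragment sizes are those stated in the legend above.

```python
def deletion_from_single_fragment(normal_kb, novel_kb):
    """Deletion size when a deletion shortens one restriction fragment."""
    return normal_kb - novel_kb


def deletion_from_fused_fragments(left_kb, right_kb, fused_kb):
    """Deletion size when a deletion removes the restriction site between
    two adjacent fragments, leaving a single fused fragment."""
    return left_kb + right_kb - fused_kb


# BamH I digest: normal 15 kb fragment, novel 9.5 kb fragment.
bamh1_estimate = deletion_from_single_fragment(15.0, 9.5)

# EcoR I digest: the 41 kb and 18 kb fragments are replaced by a 53 kb band.
ecor1_estimate = deletion_from_fused_fragments(41.0, 18.0, 53.0)

print(bamh1_estimate)  # 5.5
print(ecor1_estimate)  # 6.0
```

The BamH I figures give 5.5 kb directly. The EcoR I figures give about 6 kb, which agrees with the 5.5 kb value to within the approximate sizing of large fragments.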

Figure 5b: Northern blot of total fibroblast RNA,
as (a), hybridised with the cDNAs AH4, 3A3 and AH3. A
novel transcript (PBP-875) of ~11 kb is seen with AH4
(the band is reduced in intensity because the probe is
partly deleted) and AH3 (arrowed), which flank the
deletion, but not 3A3, which is entirely deleted (see
Figure 3a). The transcripts HG-A, HG-B and HG-C, from
the duplicated area, are seen with AH3 (see Figure 4b).

Figure 5c: Left panel; FIGE of DNA from: normal
(N) and ADPKD patient OX114 (114), digested with EcoR I
and hybridised with CW10; a novel fragment of 39 kb
(arrowed) is seen in OX114. Middle panel; DNA, as
above, plus the normal mother (M) and brother (B) of
OX114 digested with BamH I and hybridised with CW21. A
larger than normal fragment of 19 kb (arrowed) was
detected in OX114 but not other family members due to
deletion of a BamH I site; together these results are
consistent with a 2 kb deletion (see Figure 3a). Right
panel; RT-PCR of RNA, as above, with primers flanking
the OX114 deletion (see Experimental Procedures). A
novel fragment of 810 bp (arrowed) is seen in OX114,
indicating a deletion of 446 bp in the PBP transcript.
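
The RT-PCR figures can be cross-checked against the base pair coordinates given for this patient in the Summary of the Invention (base pairs 1746-2192 of Figure 7, 446bp). The sketch below is illustrative only: the function names are hypothetical, and the size of the normal amplification product is derived here by arithmetic rather than stated in the legend.

```python
def deleted_length(first_bp, last_bp):
    """Length of a deletion quoted as a coordinate range.

    The description quotes 446 bp for base pairs 1746-2192, which
    corresponds to the difference between the two coordinates.
    """
    return last_bp - first_bp


def expected_normal_product(novel_product_bp, deletion_bp):
    """Size of the undeleted RT-PCR product implied by the novel product."""
    return novel_product_bp + deletion_bp


deletion_bp = deleted_length(1746, 2192)
normal_bp = expected_normal_product(810, deletion_bp)

print(deletion_bp)  # 446
print(normal_bp)    # 1256
```

On these figures the same primers would be expected to give a product of about 1256 bp from the normal transcript, so the 810 bp band is the shortened product from the deleted allele.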

Figure 5d: RT-PCR of RNA from: ADPKD patient OX32
(32) plus the proband's normal mother (M) and affected
father (F) and sibs (1) and (2) using the C primer pair
from 3A3 (see Experimental Procedures). A novel
fragment of 125 bp is detected in each of the affected
individuals.

Figure 6: Map of the region containing the TSC2
and PBP genes showing the area deleted in patient WS-53
and the position of the 77 translocation breakpoint.
Localisation of the distal end of the WS-53 deletion
was previously described (European Chromosome 16
Tuberous Sclerosis Consortium, 1993) and we have now
localised the proximal end between SM5 and JH17. The
size of the aberrant Mlu I fragment in WS-53, detected
by JH1 and JH17, is 90kb and these probes lie on
adjacent Mlu I fragments of 120kb and 70kb,
respectively. Therefore the WS-53 deletion is ~100kb.
Restriction sites for: Mlu I (M); Nru I (R); Not I (N);
and partial maps for Sac II (S) and BssH II (H) are
shown. DNA probes (open boxes) and the TSC2 and PBP
transcripts (filled boxes) are indicated below the line
with their known genomic extents (brackets). The
locations of the microsatellites KG8 and SM6 are also
indicated.

Figure 7: The partial nucleotide sequence (cDNA)
of the PKD1 transcript extending 5631bp to the 3' end
of the gene. The corresponding predicted protein (also
shown in SEQ ID NO: 4:) is shown below the sequence and
extends from the start of the nucleotide sequence. The
GT-repeat, KG8, is in the 3' untranslated region
between 5430-5448 bp. This sequence corresponds to
GenBank Accession No. L33243 and is shown in SEQ ID NO:
3:.

Figure 8: The sequence of the probe 1A1H0.6 (also 
shown in SEQ ID NO: 5:). 

Figure 9: The sequence (SEQ ID NO: 6:) of the
probe CW10 which is about 0.5kb.

Figure 10: The larger partial nucleotide sequence
(SEQ ID NO: 1:) of the PKD1 transcript (cDNA) extending
from bp 2 to 13807bp to the 3' end of the gene together
with the corresponding predicted protein (also shown in
SEQ ID NO: 2:). This larger partial sequence
encompasses the (smaller) partial sequence of Figure 7
from amino acid no. 2726 in SEQ ID NO: 3: and relates
to the entire PKD1 gene sequence apart from its extreme
5' end.

Figure 11: A map of the 75bp intron amplified by
the primer set 3A3C insert at position 3696 of the 3'
sequence showing the positions of genomic deletions
found in PKD1 patients 461 and OX1054.

Figure 12: A map of the region of chromosome 16
containing the TSC2 and PKD1 genes, showing the areas
affected in patients WS-215, WS-250, WS-212, WS-194,
WS-227 and WS-219; also WS-53 (but cf. Figure 6).
Genomic sites for the enzymes MluI (M), ClaI (C), PvuI
(P) and NruI (R) are shown. Positions of single copy
probes and cosmids used to screen for deletions are
shown below the line which represents ~400kb of genomic
DNA. The genomic distribution of the approximately
45kb TSC2 gene and known extent of the PKD1 gene are
indicated above. The hatched area represents an ~50kb
region which is duplicated more proximally on
chromosome 16p.

Detailed Description of the Drawings 
A translocation associated with ADPKD 

A major pointer to the identity of the PKD1 gene
was provided by a Portuguese pedigree (family 77) with
both ADPKD and TSC (Figure 1b). Cytogenetic analysis
showed that the mother, 77-2, has a balanced
translocation, 46XX t(16;22)(p13.3;q11.21), which was
inherited by her daughter, 77-3. The son, 77-4, has
the unbalanced karyotype, 45XY-16-22+der(16)(16qter →
16p13.3::22q11.21 → 22qter) and consequently is
monosomic for 16p13.3 → 16pter as well as for 22q11.21 →
22pter. This individual has the clinical phenotype of
TSC (see Experimental Procedures); the most likely
explanation is that the TSC2 locus located within
16p13.3 is deleted in the unbalanced karyotype.

Further analysis revealed that the mother (77-2),
and the daughter (77-3) with the balanced
translocation, have the clinical features of ADPKD (see
Experimental Procedures), while the parents of 77-2
were cytogenetically normal, with no clinical features
of TSC and no renal cysts on ultrasound examination
(aged 67 and 82 years). Although kidney cysts can be a
feature of TSC, no other clinical signs of TSC were
identified in 77-2 or 77-3, making it unlikely that the
polycystic kidneys were due to TSC. We therefore
investigated the possibility that the translocation
disrupted the PKD1 locus in 16p13.3 and proceeded to
identify and clone the region containing the
breakpoint.

The 77 family was analysed with polymorphic
markers from 16p13.3. Individual 77-4 was hemizygous
for MS205.2 and GGG1, but heterozygous for SM6 and more
proximal markers, locating the translocation breakpoint
between GGG1 and SM6 (see Figure 1a). Fluorescence in
situ hybridisation (FISH) of a cosmid from the TSC2
region, CW9D (cosmid 1 in Figure 1a), to metaphase
spreads showed that it hybridised to the der(22)
chromosome of 77-2, placing the breakpoint proximal to
CW9D and indicating that 77-4 was hemizygous for this
region, consistent with his TSC phenotype. DNA from
members of the 77 family was digested with Cla I,
separated by PFGE and hybridised with SM6, revealing a
breakpoint fragment of ~100 kb in individuals with the
der(16) chromosome (Figure 1c). The small size of this
novel fragment enabled the breakpoint to be localised
distal to SM6 in a region of just 60 kb (Figure 1a). A
cosmid contig covering this region was therefore
constructed (see Experimental Procedures for details).
The translocation breakpoint lies within a region
duplicated elsewhere on chromosome 16p (16p13.1)

It was previously noted that the region between
CW21 and N54 (Figure 1a) was duplicated at a more
proximal site on the short arm of chromosome 16
(Germino, et al., 1992; European Chromosome 16
Tuberous Sclerosis Consortium, 1993). Figure 2 shows
that a cosmid, CW10III, from the duplicated region
hybridises to two points on 16p: the distal, PKD1
region and a proximal site positioned in 16p13.1. The
structure of the duplicated area is complex, with each
fragment present once in 16p13.3 re-iterated two to four
times in 16p13.1 (see Figure 2). Cosmids spanning the
duplicated area in 16p13.3 were subcloned (see Figure
3a and Experimental Procedures for details) and a
restriction map was generated. A genomic map of the
PKD1 region was constructed using a radiation hybrid,
Hy145.19, which contains the distal portion of 16p but
not the duplicate site in 16p13.1.

To localise the 77 translocation breakpoint,
subclones from the target region were hybridised to
77-2 DNA, digested with Cla I and separated by PFGE. Once
probes mapping across the breakpoint were identified
they were hybridised to conventional Southern blots of
77 family DNA. Figure 3b shows that novel BamH I
fragments were detected from the centromeric and
telomeric side of the breakpoint, which was localised
to the distal part of the probe 8S1 (Figure 3a).
Hence, the balanced translocation was not associated
with a substantial deletion, and the breakpoint was
located more than 20 kb proximal to the TSC2 locus
(Figure 3a). These results supported the hypothesis
that polycystic kidney disease in individuals with the
balanced translocation (77-2 and 77-3) was not due to
disruption of the TSC2 gene, but indicated that a
separate gene mapping just proximal to TSC2 was likely
to be the PKD1 gene.

The polycystic breakpoint (PBP) gene is disrupted by 
the translocation 

Localisation of the 77 breakpoint identified a
precise region in which to look for a candidate for the
PKD1 gene. During the search for the TSC2 gene we
identified other transcripts not associated with TSC,
including a large transcript (~14 kb) partially
represented in the cDNAs 3A3 and AH4, which mapped to
the genomic fragments CW23 and CW21 (Figure 3a). The
orientation of the gene encoding this transcript had
been determined by the identification of a polyA tract
in the cDNA, AH4: the 3' end of this gene lies very
close to the TSC2 gene, in a tail to tail orientation
(European Chromosome 16 Tuberous Sclerosis Consortium,
1993). To determine whether this gene crossed the
translocation breakpoint, genomic probes from within the
duplicated area and flanking the breakpoint were
hybridised to Northern blots. Probes from both sides of
the breakpoint, between JH5 and JH13, identified the
14 kb transcript (Figure 3a and see below for details).
Therefore, this gene, previously called 3A3 but now
designated the PBP gene, extended over the 77
breakpoint and consequently was a candidate for the
PKD1 gene. A walk was initiated to increase the extent
of the PBP cDNA contig and several new cDNAs were
identified using probes from the single copy
(non-duplicated) region (see Experimental Procedures for
details). A cDNA contig was constructed which extended
~5.7 kb, including ~2 kb into the area that is
duplicated (Figure 3a).

Expression of the PBP gene 

Initial studies of the expression pattern of the
PBP gene were undertaken with cDNAs that map entirely
within the single copy region (e.g. AH4 and 3A3).
Figure 4a shows that the ~14 kb transcript was
identified by 3A3 in various tissue-specific cell
lines. From this and other Northern blots we concluded
that the PBP gene was expressed in all of the cell
lines tested, although often at a low level. The two
cell lines which showed the highest level of expression
were fibroblasts and a cell line derived from an
astrocytoma, G-CCM. Significant levels of expression
were also obtained in cell lines derived from kidney
(G401) and liver (Hep3B). Measuring the expression of
the PBP gene in tissue samples by Northern blotting
proved difficult because such a large transcript is
susceptible to minor RNA degradation. However, initial
results with an RNAse protection assay, using a region
of the gene located in the single copy area (see
Experimental Procedures), showed a moderate level of
expression of the PBP gene in tissue obtained from
normal and polycystic kidney (data not shown). The
widespread expression of the PBP gene is consistent
with the systemic nature of ADPKD.

Identification of transcripts that are partially
homologous to the PBP transcript

New cDNAs were identified with the genomic
fragments, JH4 and JH8, that map to the duplicated
region (Figure 3a and see Experimental Procedures).
However, when these cDNAs were hybridised to Northern
blots a more complex pattern than that seen with 3A3
was observed. As well as the ~14 kb PBP transcript,
three other, partially homologous transcripts were
identified, designated homologous gene-A (HG-A; ~21
kb), HG-B (~17 kb) and HG-C (8.5 kb) (Figure 4b).
There were two possible explanations for these results:
either the HG transcripts were alternatively spliced
forms of the PBP gene, or the HG transcripts were
encoded by genes located in 16p13.1. To determine the
genomic location of the HG loci, a fragment from the 3'
end of one HG cDNA (HG-4/1.1) was isolated. HG-4/1.1
hybridised to all three HG transcripts, but not to the
PBP transcript, and on a hybrid panel it mapped to
16p13.1 (not the PKD1 area). These results show that
all the HG transcripts are related to each other
outside the region of homology with the PBP transcript
and that the HG loci map to the proximal site
(16p13.1).

An abnormal transcript associated with the 77
translocation

As the PBP gene was transcribed across the region
disrupted by the 77 translocation breakpoint, in a
proximal to distal direction on the chromosome (see
Figure 3a), it was possible that a novel transcript
originating from the PBP promoter would be found in
this family. Figure 4c shows that, using a probe to the
PBP transcript that mapped mainly proximal to the
breakpoint, a novel transcript of approximately 9 kb
(PBP-77) derived from the der(16) product of the
translocation was detected. Interestingly, the PBP-77
transcript appears to be expressed at a higher level
than the normal PBP product. These results confirmed
that the 77 translocation disrupts the PBP gene and
support the hypothesis that this is the PKD1 gene.

Mutations of the PBP gene in other ADPKD patients

To prove that the PBP gene is the defective gene
at the PKD1 locus, we analysed this region for
mutations in patients with typical ADPKD. The 3' end
of the PBP gene was most accessible to study as it maps
outside the duplicated area. To screen this region,
BamH I digests of DNA from 282 apparently unrelated
ADPKD patients were hybridised with the probe 1A1H0.6
(see Figure 3a). In addition, a large EcoR I fragment
(41 kb) which contains a significant proportion of the
PBP gene was assayed by field inversion gel
electrophoresis (FIGE) in 167 ADPKD patients, using the
probe CW10. Two genomic rearrangements were identified
in ADPKD patients by these procedures, each identified
by both methods.

The first rearrangement was identified in patient
OX875 (see Experimental Procedures for clinical
details) who was shown to have a 5.5 kb genomic
deletion within the 3' end of the PBP gene, producing a
smaller transcript (PBP-875) (see Figures 5a, b and 3a
for details). This genomic deletion results in a ~3 kb
internal deletion of the transcript with the ~500 bp
adjacent to the polyA tail intact. In this family
linkage of ADPKD to chromosome 16 could not be proven
because although OX875 has a positive family history of
ADPKD there were no living, affected relatives.
However, paraffin-embedded tissue from her affected
father (now deceased) was available. We demonstrated
that this individual had the same rearrangement as
OX875 by PCR amplification of a 220 bp fragment spanning
the deletion (data not shown). This result and
analysis of two unaffected sibs of OX875, that did not
have the deletion, showed that this mutation was
transmitted with ADPKD.

The second rearrangement detected by hybridisation
was a 2 kb genomic deletion within the PBP gene, in
ADPKD patient OX114 (see Experimental Procedures for
clinical details and Figures 5c and 3a). No abnormal
PBP transcript was identified by Northern blot
analysis, but using primers flanking the deletion (see
Experimental Procedures) a shortened product was
detected by RT-PCR (Figure 5c). This was cloned and
sequenced and shown to have a frame-shift deletion of
446 bp (between base pair 1746 and 2192 of the sequence
shown in Figure 7). OX114 is the only member of the
family with ADPKD (she has no children) and ultrasound
analysis of her parents at age 78 (father) and 73 years
old (mother) showed no evidence of renal cysts.
Somatic cell hybrids were produced from OX114 and the
deleted chromosome was found to be of paternal origin
by haplotype analysis. The father of OX114 is now
deceased but analysis of DNA from the brother of OX114
(OX984) with seven microsatellite markers from the PKD1
region (see Experimental Procedures) showed that he
shares the same paternal chromosome, in the PKD1
region, as OX114. Renal ultrasound revealed no cysts
in OX984 at age 53 and no deletion was detected by DNA
analysis (Figure 5c). Hence, the deletion in OX114 is
a de novo event associated with the development of
ADPKD. Although it is not possible to show that the
ADPKD is chromosome 16-linked, the location of the PBP
gene indicates that this is a de novo PKD1 mutation.
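
The frame-shift classification of the OX114 deletion can be checked from the coordinates quoted above. The following is a minimal illustrative sketch only; the function name classify_deletion is hypothetical and is not part of the described procedures. It assumes the deleted length is the plain difference of the two quoted positions.

```python
def classify_deletion(first_bp, last_bp):
    """Classify a transcript deletion by its effect on the reading frame.

    The length is taken as the plain difference of the two coordinates,
    which is how 446 bp follows from positions 1746 and 2192 in the text.
    """
    length = last_bp - first_bp
    # A length divisible by 3 removes whole codons; anything else shifts the frame.
    effect = "in-frame" if length % 3 == 0 else "frame-shift"
    return length, effect


ox114 = classify_deletion(1746, 2192)
print(ox114)
```

A deleted length that is not a multiple of three, as here, is what makes the deletion a frame-shift.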

To identify more PKD1 associated mutations, single
copy regions of the PBP gene were analysed by RT-PCR
using RNA isolated from lymphoblastoid cell lines
established from ADPKD patients. cDNA from 48 unrelated
patients was amplified with the primer pair 3A3 C (see
Experimental Procedures) and the product of 260 bp was
analysed on an agarose gel. In one patient, OX32, an
additional smaller product (125 bp) was identified,
consistent with a deletion or splicing mutation. OX32
comes from a large family in which the disease can be
traced through three generations. Analysis of RNA from
two affected sibs of OX32 and his parents showed that
the abnormal transcript segregates with PKD1 (Figure
5d).

Amplification of normal genomic DNA with the 3A3 C
primers generates a product of 418 bp; sequencing
showed that this region contains two small introns (5',
75 bp and 3', 83 bp) flanking a 135 bp exon. The
product amplified from OX32 genomic DNA was normal in
size, excluding a genomic deletion. However,
heteroduplex analysis of that DNA revealed larger
heteroduplex bands, consistent with a mutation within
that genomic interval. The abnormal OX32 RT-PCR
product was cloned and sequenced: this demonstrated
that, although present in genomic DNA, the 135 bp exon
was missing from the abnormal transcript. Sequencing
of OX32 genomic DNA demonstrated a G → C transversion at
+1 of the splice donor site following the 135 bp exon.
This mutation was confirmed in all available affected
family members by digesting amplified genomic DNA with
the enzyme Bst NI: a site is destroyed by the base
substitution. The splicing defect results in an
in-frame deletion of 135 bp from the PBP transcript (3696
bp to 3831 bp of the sequence shown in Figure 7).
Together, the three intragenic mutations confirm that
the PBP gene is the defective gene at the PKD1 locus.
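
The product sizes reported for the OX32 splicing mutation are mutually consistent, as the short check below illustrates. It is a sketch using only the figures quoted in the text; the variable names are chosen here for illustration.

```python
# Sizes quoted in the text, in base pairs.
CDNA_PRODUCT = 260      # RT-PCR product from normal cDNA with the 3A3 C primers
INTRON_5PRIME = 75      # 5' intron within the amplified genomic interval
INTRON_3PRIME = 83      # 3' intron within the amplified genomic interval
EXON = 135              # exon flanked by the two introns

# Genomic DNA retains both introns, so its product is longer than the cDNA product.
genomic_product = CDNA_PRODUCT + INTRON_5PRIME + INTRON_3PRIME

# Skipping the exon shortens the RT-PCR product by the exon length.
skipped_product = CDNA_PRODUCT - EXON

# An exon length divisible by 3 leaves the reading frame intact when skipped.
exon_in_frame = EXON % 3 == 0

# Extent of the deletion in the Figure 7 numbering.
deleted_span = 3831 - 3696

print(genomic_product, skipped_product, exon_in_frame, deleted_span)
```

The computed values match the 418 bp genomic product, the 125 bp abnormal RT-PCR product and the 135 bp in-frame deletion described above.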
Deletions that disrupt the TSC2 and the PKD1 gene

We previously identified a deletion (WS-53) which
disrupts the TSC2 gene and the PKD1 gene (European
Chromosome 16 Tuberous Sclerosis Consortium, 1993),
although its full proximal extent was not determined.
Further study has shown that the deletion extends ~100
kb (see Figure 6 for details) and deletes most if not
all of the PKD1 gene. This patient has TSC but also
has unusually severe polycystic disease of the kidneys.
Other patients with a similar phenotype have also been
under investigation. Deletions involving both TSC2 and
PKD1 were identified and characterised in six patients
in whom TSC was associated with infantile polycystic
kidney disease. As well as the deletion in WS-53,
those in WS-215 and WS-250 also extended proximally
well beyond the known distribution of PKD1 and probably
delete the entire gene. The deletion in WS-194
extended over the known extent of PKD1, but not much
further proximally, while the proximal breakpoints in
WS-219 and WS-227 lay within PKD1 itself. Northern
analysis of case WS-219 with probe JH8, which lies
outside the deletion, showed a reduced level of the
PKD1 transcript but no evidence of an abnormally sized
transcript (data not shown). Analysis of samples from
the clinically unaffected parents of patients WS-53,
WS-215, WS-219, WS-227 and WS-250 showed the deletions
in these patients to be de novo. The father of WS-194
was unavailable for study.

In a further case (WS-212), renal ultrasound
showed no cysts at four years of age but a deletion was
identified which removed the entire TSC2 gene and
deleted an Xba I site which is located 42 bp 5' to the
polyadenylation signal of PKD1. To determine the
precise position of the proximal breakpoint in PKD1, a
587 bp probe from the 3' untranslated region (3'UTRP)
was hybridised to Xba I digested DNA. A 15 kb Xba I
breakpoint fragment was detected with an approximately
equal intensity to the normal fragment of 6 kb,
indicating that most of the PKD1 3'UTR was preserved on
the mutant chromosome. Evidence that a PKD1 transcript
is produced from the deleted chromosome in WS-212 was
obtained by 3' rapid amplification of cDNA ends (RACE)
with a novel, smaller product generated from WS-212
cDNA. Characterisation of this product showed that
polyadenylation occurs 546 bp 5' to the normal position,
within the 3'UTR of PKD1 (231 bp 3' to the stop codon at
5073 bp of the described PKD1 sequence). A
transcript with an intact open reading frame is thus
produced from the deleted WS-212 chromosome. It is
likely that a functional PKD1 protein is produced from
this transcript, explaining the lack of cystic disease
in this patient. The sequence preceding the novel
site of polyA addition is:

AGTCAGT AATTTA TATGGTGTTAAAATGTG(A)n. Although not
conforming precisely to the consensus of AATAAA, it is
likely that part of this AT rich region acts as an
alternative polyadenylation signal if, as in this case,
the normal signal is deleted (a possible sequence is
underlined).

The WS-212 deletion is 75 kb between SM9-CW9
distally and the PKD1 3'UTR proximally. The WS-215
deletion is 150 kb between CW15 and SM6-JH17. WS-194
has 65 kb deleted between CW20 and CW10-CW36. WS-227
has a 50 kb deletion between CW20 and JH11 and WS-219
has a 27 kb deletion between JH1 and JH6. The distal
end of the WS-250 deletion is in CW20 but the precise
location of the proximal end is not known. However,
the same breakpoint fragment of 320 kb is seen with
Pvu I-digested DNA using probes on adjacent Pvu I
fragments, CE18 (which normally detects a 245 kb
fragment) and BLu24 (235 kb). Hence this deletion can
be estimated at ~160 kb. b. PFGE analysis of the deletion
in WS-219. Mlu I digested DNA from a normal control (N)
and WS-219 probed with the clones H2, JH1, CW21 and
CW10 which detect an ~130 kb fragment in normal
individuals. CW10 also detects a much smaller fragment
from the duplicated region situated more proximally on
16p. A novel fragment of ~100 kb is seen in WS-219 with




probes H2 and CW10 which flank the deletion in this
patient. JH1 is partially deleted but detects the
novel band weakly. The aberrant fragment is not
detected by CW21, which is deleted on the mutant
chromosome. BamH I digested DNA of normal control (N)
and WS-219 separated by conventional gel
electrophoresis and hybridised to probes JH1 and JH6
which flank the deletion. The same breakpoint fragment
of ~3 kb is seen with both probes, consistent with a
deletion of ~27 kb ending within the BamH I fragments
seen by these probes.
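
The deletion sizes estimated from PFGE data follow from a simple relation: when probes on two adjacent normal fragments detect the same breakpoint fragment, the deletion is approximately the sum of the two normal fragments less the breakpoint fragment. The sketch below illustrates this with the figures given for WS-250 and, from the legend to Figure 6, WS-53. The function name deletion_size_kb is hypothetical, and the calculation assumes that the deletion lies wholly within the two adjacent fragments and removes the restriction site between them.

```python
def deletion_size_kb(normal_a_kb, normal_b_kb, junction_kb):
    """Estimate a deletion from two adjacent normal restriction fragments
    and the single junction fragment seen on the mutant chromosome.

    Assumes the deletion removes the site shared by the two fragments,
    so the junction fragment is what remains of both of them.
    """
    return normal_a_kb + normal_b_kb - junction_kb


# WS-250: Pvu I fragments of 245 kb (CE18) and 235 kb (BLu24), junction of 320 kb.
ws250 = deletion_size_kb(245, 235, 320)

# WS-53: Mlu I fragments of 120 kb and 70 kb, aberrant fragment of 90 kb.
ws53 = deletion_size_kb(120, 70, 90)

print(ws250, ws53)
```

The results agree with the estimates of ~160 kb and ~100 kb given in the text; PFGE sizing is approximate, so these values are estimates rather than exact lengths.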
Two further deletions 

In addition we have characterised two further
mutations of this gene which were identified in typical
PKD1 families. In both cases the mutation is a
deletion in the 75 bp intron amplified by the primer
pair 3A3C (European Polycystic Kidney Disease
Consortium, 1994). The deletions are of 18 bp and 20 bp,
respectively, in the patients 461 and OX1054. Although
these deletions do not disrupt the highly conserved
sequences flanking the exon/intron boundaries, they do
result in aberrant splicing of the transcript. In both
cases, two abnormal mRNAs are produced, one larger and
one smaller than normal. Sequencing of these cDNAs
showed that the larger transcript includes the deleted
intron, and so has an in-frame insertion of 57 bp in
461, while OX1054 has a frameshift insertion of 55 bp.
The smaller transcript is due to activation of a
cryptic splice site in the exon preceding the deleted
intron and results in an in-frame deletion of 66 bp in
both patients. The demonstration of two additional
mutations of this gene in PKD1 patients further
confirms that this is the PKD1 gene.
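
The insertion sizes and their effect on the reading frame follow directly from the intron length and the size of each deletion. The following is an illustrative sketch using only the figures quoted above; the function name retained_intron_insertion is hypothetical.

```python
INTRON_BP = 75  # length of the normal intron amplified by the 3A3C primer pair


def retained_intron_insertion(deleted_bp):
    """Return the size of the retained (partly deleted) intron and whether
    inserting it into the transcript preserves the reading frame."""
    inserted = INTRON_BP - deleted_bp
    effect = "in-frame" if inserted % 3 == 0 else "frameshift"
    return inserted, effect


patient_461 = retained_intron_insertion(18)
patient_ox1054 = retained_intron_insertion(20)

# The 66 bp deletion caused by the cryptic splice site removes whole codons.
cryptic_site_in_frame = 66 % 3 == 0

print(patient_461, patient_ox1054, cryptic_site_in_frame)
```

This reproduces the 57 bp in-frame insertion in patient 461, the 55 bp frameshift insertion in OX1054 and the in-frame nature of the 66 bp deletion.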
Characterisation of the PKD1 gene

To characterise the PKD1 gene further,
evolutionary conservation was analysed by 'zoo
blotting'. Using probes from the single copy, 3'
region (3A3) and from the duplicated area (JH4, JH8)
the PKD1 gene was found to be conserved in other
mammalian species, including horse, dog, pig and rodents
(data not shown). No evidence of related sequences was
seen in chicken, frog or Drosophila by hybridisation at
normal stringency. The degree of conservation was similar
when probes from the single copy or the duplicated
region were employed.

The full genomic extent of the PKD1 gene is not
yet known, although results obtained by hybridisation
to Northern blots show that it extends from at least as
far as JH13. Several CpG islands have been localised
5' of the known extent of the PKD1 gene (Figure 6),
although there is no direct evidence that any of these
are associated with this gene.

The cDNA contig extending 5631 bp to the 3' end of
the PKD1 transcript was sequenced; where possible more
than one cDNA was analysed and in all regions both
strands were sequenced (Figure 7). We estimated that
this accounts for ~40% of the PKD1 transcript. An
open reading frame was detected which runs from the 5'
end of the region sequenced and spans 4842 bp, leaving
a 3' untranslated region of 789 bp which contains the
previously described microsatellite, KG8 (Peral, et
al., 1994; Snarey, et al., 1994). A polyadenylation
signal is present at nucleotides 5598-5603 and a polyA
tail was detected in two independent cDNAs (AH4 and
AH6) at position 5620. Comparison with the cDNAs HG-4
and 11BHS21, which are encoded by genes in the
duplicate, 16p13.1 region, shows that 1866 bp at the 5'
end of the partial PKD1 sequence shown in Figure 7 lies
within the duplicated area. The predicted amino acid
sequence from the available open reading frame extends
1614 residues and is shown in Figure 7. A search of
the SwissProt and NBRF databases with the available
protein sequence, using the Blast programme (Altschul,
et al., 1990) identified only short regions of
similarity (notably, between amino-acids 590-770 and
1390-1530) to a diverse group of proteins; no highly
significant areas of homology were recognised. The
importance of the short regions of similarity is
unclear as the search for protein motifs with the
ProSite Programme did not identify any recognised
functional protein domains within the PKD1 gene.
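
The lengths quoted for the Figure 7 sequence can be cross-checked against one another. The sketch below is illustrative only and uses the figures from the text; the ~14 kb transcript size is the approximate value estimated from Northern blots, so the resulting fraction is likewise approximate.

```python
CONTIG_BP = 5631         # sequenced cDNA contig (Figure 7)
ORF_BP = 4842            # open reading frame within the contig
TRANSCRIPT_BP = 14_000   # approximate transcript size from Northern blots

# What remains after the open reading frame is the 3' untranslated region.
utr_bp = CONTIG_BP - ORF_BP

# Three nucleotides encode one residue; leftover should be zero for a clean frame.
residues, leftover = divmod(ORF_BP, 3)

# Proportion of the whole transcript covered by the contig.
fraction_sequenced = CONTIG_BP / TRANSCRIPT_BP

print(utr_bp, residues, leftover, round(fraction_sequenced, 2))
```

The values obtained (789 bp, 1614 residues, about 0.40) agree with the 3' untranslated region, the predicted protein length and the ~40% estimate given above.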

The task of identifying and characterising the
PKD1 gene has been more difficult than for other
disorders because more than three quarters of the gene
is embedded in a region of DNA that is duplicated
elsewhere on chromosome 16. This segment of 40-50 kb
of DNA, present as a single copy in the PKD1 area
(16p13.3), is re-iterated as several divergent copies
in the more proximal region, 16p13.1. This proximal
site contains three gene loci (HG-A, -B and -C) that
each produce polyadenylated mRNAs and share substantial
homology to the PKD1 gene; it is not known whether
these partially homologous transcripts are translated
into functional proteins.

Although gene amplification is known as a major
mechanism for creating protein diversity during
evolution, the discovery of a human disease locus
embedded within an area duplicated relatively recently
is a new observation. In this case, because of the
recent nature of the reiteration, the whole duplicated
genomic region retains a high level of homology, not
just the exons. The sequence of events leading to the
duplication and which sequence represents the original
gene locus are not yet clear. However, early evidence
of homology of the 3' ends of the three HG transcripts,
which are different from the 3' end of the PKD1 gene,
indicated that the loci in 16p13.1 have probably arisen
by further reiteration of sequences at this site, after
it separated from the distal locus.

To try to overcome the duplication problem we have
employed an exon linking approach using RNA isolated
from a radiation hybrid, Hy145.19, that contains just
the PKD1 part of chromosome 16, and not the duplicate
site in 16p13.1. Hence, this hybrid produces
transcripts from the PKD1 gene but not from the
homologous genes (HG-A, HG-B and HG-C). We have also
sequenced much of the genomic region containing the
PKD1 gene, from the cosmid JH2A, and have sequenced a
number of cDNAs from the HG locus. To determine the
likely position of PKD1 exons in the genomic DNA we
compared HG cDNAs (HG-4 and HG-7) to the genomic
sequence. We then designed primers with sequences
corresponding to the genomic DNA, to regions identified
by the HG exons and, employing cDNA generated from the
hybrid Hy145.19, we amplified sections of the PKD1
transcript. The polymerase Pfu was used to minimise
incorporation errors. These amplified fragments were
then cloned and sequenced. The PKD1 cDNA contig whose
sequence is shown in Figure 10 is made up of (3'-5')
the original 5.7 kb of sequence shown in Figure 7, and
the cDNAs: gap a22 (890 bp), gap gamma (872 bp), a
section of genomic DNA from the clone JH8 (2,724 bp)
which corresponds to a large exon, S1-S3 (733 bp), S3-S4
(1,589 bp) and S4-S13 (1,372 bp). Together these
make a cDNA of 13,807 bp with the extreme 5' end of the
transcript still uncharacterised. When these cDNAs
from the PKD1 contig were sequenced an open reading
frame was found to run from the start of the contig to
the previously-identified stop codon, a region of
13,018 bp. The predicted protein encoded by the PKD1
transcript is also shown in Figure 10 and has 4,339
amino acid residues.
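
The figures given for the larger contig are consistent with those of the Figure 7 sequence, as the following sketch illustrates. It is a cross-check only, not a description of how the sequence was assembled. One point is an assumption made here: the reading frame is taken to begin at bp 2, which is where the legend to Figure 10 says the displayed sequence starts.

```python
CONTIG_LAST_BP = 13_807   # last nucleotide of the larger contig (Figure 10)
ORF_LAST_BP = 13_018      # end of the region running to the stop codon
FIRST_BP = 2              # assumed start of the reading frame (Figure 10 legend)
RESIDUES_FULL = 4_339     # predicted protein, Figure 10
RESIDUES_FIG7 = 1_614     # predicted protein, Figure 7

# The 3' untranslated region should match the 789 bp found for Figure 7.
utr_bp = CONTIG_LAST_BP - ORF_LAST_BP

# Number of whole codons between the assumed start and the end of the frame.
codons, leftover = divmod(ORF_LAST_BP - FIRST_BP + 1, 3)

# Position in the larger protein at which the Figure 7 residues begin.
first_fig7_residue = RESIDUES_FULL - RESIDUES_FIG7 + 1

print(utr_bp, codons, leftover, first_fig7_residue)
```

Under that assumption the frame holds exactly 4,339 codons, the untranslated region is again 789 bp, and the Figure 7 sequence begins at residue 2726, as stated in the legend to Figure 10.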

We therefore have compelling evidence that
mutations of the PKD1 gene give rise to the typical
phenotype of ADPKD. The location of this gene within
the PKD1 candidate region and the available genetic
evidence from the families with mutations show that
this is the PKD1 gene. The present invention therefore
includes the PKD1 gene itself and the six
PKD1-associated mutations which have been described: a de
novo translocation, which was subsequently transmitted
with the phenotype; two intragenic deletions (one a de
novo event); two further deletions; and a splicing
defect.

It has previously been argued that PKD1 could be
recessive at the cellular level, with a second somatic
mutation required to give rise to cystic epithelium
(Reeders, 1992). This "two hit" process is thought to
be the mutational mechanism giving rise to several
dominant diseases, such as neurofibromatosis (Legius,
et al., 1993) and tuberous sclerosis (Green, et al.,
1994) which result from a defect in the control of
cellular growth. If this were the case, however, we
might expect that a proportion of constitutional PKD1
mutations would be inactivating deletions as seen in
these other disorders.

The location of the PKD1 mutations may, however,
reflect some ascertainment bias as it is this single
copy area which has been screened most intensively for
mutations. Nevertheless, no additional deletions were
detected when a large part of the gene was screened by
FIGE, and studies by PFGE showed no large deletions of
this area in 75 PKD1 patients. It is possible that the
mutations detected so far result in the production of
an abnormal protein which causes disease through a gain
of function. However, it is also possible that these
mutations eliminate the production of functional
protein from this chromosome and result in the PKD1
[...] the second PKD1 homologue by somatic mutation.

At least one mutation which seems to delete the
entire PKD1 gene has been identified (WS-53) but in
this case it also disrupts the adjacent TSC2 gene and
the resulting phenotype is of TSC with severe cystic
kidney disease. Renal cysts are common in TSC so that
the phenotypic significance of deletion of the PKD1
gene in this case is difficult to assess. It is clear
that not all cases of renal cystic disease in TSC are
due to disruption of the PKD1 gene; chromosome 9 linked
TSC (TSC1) families also manifest cystic kidneys and we
have analysed many TSC2 patients with kidney cysts who
do not have deletion of the PKD1 gene.

Preliminary analysis of the PKD1 protein sequence
has highlighted two regions which provide some clues to
the possible function of the PKD1 gene. At the extreme
5' end of the characterised region are two leucine-rich
repeats (LRRs) (amino acids 29-74) flanked by
characteristic amino flanking (amino acids 6-28) and
carboxy flanking sequences (amino acids 76-133)
(Rothberg et al., 1990). LRRs are thought to be
involved in protein-protein interactions (Kobe and
Deisenhofer, 1994) and the flanking sequences are only
found in extracellular proteins. Other proteins with
LRRs flanked on the amino and carboxy sides are
receptors or are involved in adhesion or cellular
signalling. Further 3' on the protein (amino acids
350-515) is a C-type lectin domain (Curtis et al.,
1992). This indicates that this region binds
carbohydrates and is also likely to be extracellular.
These two regions of homology indicate that the 5' part
of the PKD1 protein is extracellular and involved in
protein-protein interactions. It is possible that this
protein is a constituent of, or plays a role in
assembling, the extracellular matrix (ECM) and may act
as an adhesive protein in the ECM. It is also possible
that the extracellular portion of this protein is
important in signalling to other cells. The function
of much of the PKD1 protein is still not fully known
but the presence of several hydrophobic regions
indicates that the protein may be threaded through the
cell membrane.

Familial studies indicate that de novo mutations
probably account for only a small minority of all ADPKD
cases; a recent study detected 5 possible new mutations
in 209 families (Davies, et al., 1991). However, in
our study one of three intragenic mutations detected
was a new mutation and the PKD1 associated
translocation was also a de novo event. Furthermore,
the mutations detected in the two familial cases do not
account for a significant proportion of the local PKD1.
The OX875 deletion was only detected in 1 of 282
unrelated cases, and the splicing defect was seen in
only 1 of 48 unrelated cases. Nevertheless, studies of
linkage disequilibrium have found evidence of common
haplotypes associated with PKD1 in a proportion of some
populations (Peral, et al., 1994; Snarey, et al.,
1994) suggesting that common mutations will be
identified.
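
To put the counts quoted above on a common scale, they can be expressed as percentages. This is a small arithmetic sketch only; the dictionary labels and the `percent` helper are illustrative names, and the counts are those stated in the paragraph (5 of 209 families, 1 of 282 and 1 of 48 unrelated cases).

```python
# Counts as quoted in the text: (observations, number screened).
OBSERVATIONS = {
    "possible new mutations (Davies, et al., 1991)": (5, 209),
    "OX875 deletion": (1, 282),
    "splicing defect": (1, 48),
}


def percent(hits: int, total: int) -> float:
    """Return hits as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * hits / total


if __name__ == "__main__":
    for label, (hits, total) in OBSERVATIONS.items():
        print(f"{label}: {hits}/{total} = {percent(hits, total):.2f}%")
```

Each figure is a few percent or less, which is consistent with the statement that none of these mutations accounts for a significant proportion of cases. Note that the denominators differ in kind (families versus unrelated cases), so the percentages are not directly comparable.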

Once a larger range of mutations has been
characterised it will be possible to evaluate whether
the type and location of mutation determines disease
severity, and if there is a correlation between
mutation and extra-renal manifestations. Previous
studies have provided some evidence that the risk of
cerebral aneurysms 'runs true' in families (Huston, et
al., 1993) and that some PKD1 families exhibit a
consistently mild phenotype (Ryynanen, et al., 1997).
A recent study has concluded that there is evidence of
anticipation in ADPKD families, especially if the
disease is transmitted through the mother (Fink, et
al., 1994). Furthermore, analysis of families with
early manifestation of ADPKD shows that there is a
significant intra-familial recurrence risk and that
childhood cases are most often transmitted maternally
(Fink, et al., 1993; Zerres, et al., 1993). This
pattern of inheritance is reminiscent of that seen in
diseases in which an expanded trinucleotide repeat was
found to be the mutational mechanism (reviewed in
Mandel, 1993). However, no evidence for an expanding
repeat correlating with PKD1 has been found in this
region although such a sequence cannot be excluded.

There is ample evidence that early presymptomatic
diagnosis of PKD1 is helpful because it allows
complications such as hypertension and urinary tract
infections to be monitored and treated quickly (Ravine,
et al., 1991). The identification of mutations within
a family will allow rapid screening of that and other
families with the same mutation. However, genetic
linkage analysis is likely to remain important for
presymptomatic diagnosis. The accuracy and ease of
linkage based diagnosis will be improved by the
identification of the PKD1 gene as a microsatellite
lies in the 3' untranslated region of this gene (KG-8)
and several CA repeats are located 5' of the gene (see
Figures 1a and 6; Peral, et al., 1994; Snarey, et al.,
1994).

Experimental Procedures 
Clinical Details of Patients 

Family 77 

77-2 and 77-3 are 48 and 17 years old,
respectively, and have typical ADPKD. Both have
bilateral polycystic kidneys and 77-2 has impaired
renal function. Neither patient manifests any signs of
TSC (apart from cystic kidneys) on clinical and
ophthalmological examination or by CT scan of the
brain.

77-4 is 13 years old, severely mentally retarded
and has multiple signs of TSC including adenoma
sebaceum, depigmented macules and periventricular
calcification on CT scan. Renal ultrasound reveals a
small number of bilateral renal cysts.
ADPKD patients 

OX875 developed ESRD from ADPKD, aged 46.
Progressive decline in renal function had been observed
over 17 years; ultrasound examinations documented
enlarging polycystic kidneys with less extensive
hepatic cystic disease. Both kidneys were removed
after renal transplantation and pathological
examination showed typical advanced cystic disease in
kidneys weighing 1920 g and 3450 g (normal average 120 g).

OX114 developed ESRD from ADPKD aged 54; diagnosis
was made by radiological investigation during an
episode of abdominal pain aged 25. A progressive
decline in renal function and the development of
hypertension were subsequently observed. Ultrasonic
examination demonstrated enlarged kidneys with typical
cystic disease, with less severe hepatic involvement.

OX32 is a member of a large kindred affected by
typical ADPKD in which several members have developed
ESRD. The patient himself has been observed for 12
years with progressive renal failure and hypertension
following ultrasonic demonstration of polycystic
kidneys.

No signs of TSC were observed on clinical
examination of any of the ADPKD patients.

DNA Electrophoresis and Hybridisation

DNA extraction, restriction digests, electrophoresis,
Southern blotting, hybridisation and washing
were performed by standard methods or as previously
described (Harris, et al., 1990). FIGE was performed
with the Biorad FIGE Mapper using programme 5 to
separate fragments from 25-50 kb. High molecular
weight DNA for PFGE was isolated in agarose blocks and
separated on the Biorad CHEF DRII apparatus using
appropriate conditions.

Genomic DNA probes and somatic cell hybrids 

Many of the DNA probes used in this study have
been described previously: MS205.2 (D16S309; Royle, et
al., 1992); GGG1 (D16S259; Germino, et al., 1990); N54
(D16S139; Himmelbauer, et al., 1991); SM6 (D16S665),
CW23, CW21, and JH1 (European Chromosome 16 Tuberous
Sclerosis Consortium, 1993). Microsatellite probes for
haplotype analysis were KG8 and W5.2 (Snarey, et al.,
1994); SM6, CW3 and CW2 (Peral, et al., 1994); 16AC2.5
(Thompson, et al., 1992); SM7 (Harris, et al., 1991);
VK5AC (Aksentijevich, et al., 1993).

New probes isolated during this study were: JH4,
JH5, JH6, 11 kb, 6 kb and 6 kb BamH I fragments,
respectively, and JH13 and JH14, 4 kb and 2.8 kb BamH
I-EcoR I fragments, respectively, all from the cosmid
JH2A; JH8 and JH10 are 4.5 kb and 2 kb Sac I fragments,
respectively, and JH12 a 0.6 kb Sac I-BamH I fragment, all
from JH4; BS1 and BS3 are 2.4 kb and 0.6 kb Sac II
fragments, respectively, from JH8; CW10 is a 0.5 kb Not
I-Mlu I fragment of SM25A; JH17 is a 2 kb EcoR I
fragment of NM17.

The somatic cell hybrids N-OH1 (Germino, et al.,
1990), P-MWH2A (European Chromosome 16 Tuberous
Sclerosis Consortium, 1993) and Hy145.19 (Himmelbauer,
et al., 1991) have previously been described. Somatic
cell hybrids containing the paternally derived (BP2-10)
and maternally derived (BP2-9) chromosomes from OX114
were produced by the method of Deisseroth and Hendrick
(1979).

Constructing a cosmid contig 

Cosmids were isolated from chromosome 16 specific
and total genomic libraries, and a contig was
constructed using the methods and libraries previously
described (European Chromosome 16 Tuberous Sclerosis
Consortium, 1993). To ensure that cosmids were derived
from the 16p13.3 region (not the duplicate 16p13.1
area), probes from the single copy area were initially
used to screen libraries (e.g. CW21 and N54). Two
cosmids mapped entirely within the area duplicated,
CW10III and JC10.2B. To establish that these were from
the PKD1 area, they were restriction mapped and
hybridised with the probe CW10. The fragment sizes
detected were compared to results obtained with hybrids
containing only the 16p13.3 area (Hy145.19) or only the
16p13.1 region (P-MWH2A).

FISH

FISH was performed essentially as previously
described (Buckle and Rack, 1993). The hybridisation
mixture contained 100 ng of biotin-11-dUTP labelled
cosmid DNA and 2.5 µg human Cot-1 DNA (BRL), which was
denatured and annealed at 37°C for 15 min prior to
hybridisation at 42°C overnight. After stringent
washes the site of hybridisation was detected with
successive layers of fluorescein-conjugated avidin (5
µg/ml) and biotinylated anti-avidin (5 µg/ml) (Vector
Laboratories). Slides were mounted in Vectashield
(Vector Laboratories) containing 1 µg/ml propidium
iodide and 1 µg/ml 4',6-diamidino-2-phenylindole
(DAPI), to allow concurrent G-banded analysis under UV
light. Results were analysed and images captured using
a Bio-Rad MRC 600 confocal laser scanning microscope.

cDNA screening and characterisation

Foetal brain cDNA libraries in λ phage (Clontech
and Stratagene) were screened by standard methods with
genomic fragments in the single copy area (equivalent
to CW23 and CW21) or with a 0.8 kb Pvu II-EcoR I single
copy fragment of AH3. Six PBP cDNAs were characterised
including two previously described, AH4 (1.7 kb), 3A3
(2.0 kb) (European Chromosome 16 Tuberous Sclerosis
Consortium, 1993), and four novel cDNAs AH3 (2.2 kb),
AH6 (2.0 kb), A1C (2.2 kb) and B1E (2.9 kb). A
striatum library (Stratagene) was screened with JH4 and
a HG-C cDNA, 11BHS21 (3.8 kb), was isolated; 2XP,9 is
a 0.9 kb Pvu II-EcoR I subclone of this cDNA. A HG-A
or HG-B cDNA, HG-4 (7 kb), was also isolated by
screening the foetal brain library (Stratagene) with
JH8. HG-4/1.1 is a 1.1 kb Pvu II-EcoR I fragment from
the 3' end of HG-4. 1A1H.6 is a 0.6 kb Hind III-EcoR I
subclone of a TSC2 cDNA, 1A-1 (1.7 kb), which was
isolated from the Clontech library. Each cDNA was
subcloned into Bluescript and sequenced utilising a
combination of sequential truncation and
oligonucleotide primers using DyeDeoxy Terminators
(Applied Biosystems) and an ABI 373A DNA Sequencer
(Applied Biosystems) or by hand with 'Sequenase' T7 DNA
polymerase (USB).

RNA Procedures

Total RNA was isolated from cell lines and tissues
by the method of Chomczynski and Sacchi (1987) and
enrichment for mRNA was made using the PolyATtract mRNA
Isolation System (Promega). For RNA electrophoresis
0.5% agarose denaturing formaldehyde gels were used
which were Northern blotted, hybridised and washed by
standard procedures. The 0.24-9.5 kb RNA (Gibco BRL)
size standard was used and hybridisation of the probe
(1-9B3) to the 13 kb Utrophin transcript (Love, et al.,
1989) in total fibroblast RNA was used as a size marker
for the large transcripts.

RT-PCR was performed with 2.5 µg of total RNA by
the method of Brown et al. (1990) with random hexamer
primers, except that AMV-reverse transcriptase (Life
Sciences) was employed. To characterise the deletion
of the PBP transcript in OX114 we used the primers:

AH3 F9 5' TTT GAC AAG CAC ATC TGG CTC TC 3'

AH3 B7 5' TAG AGO AGG AGG CTC CGC AG 3'

in a DMSO containing PCR buffer (Dode, et al., 1990)
with 0.5 mM MgCl2 and 36 cycles of: 94°C, 1 min; 61°C,
1 min; 72°C, 2 min plus a final extension of 10 min.
The 3A3 C primers used to amplify the OX32 cDNA and DNA
were:

3A3 C1 5' CGC CGC TTC ACT AGC TTC GAC 3'

3A3 C2 5' ACG CTC CAG AGG GAG TCC AC 3'

These were employed in a PCR buffer and cycle
previously described (Harris, et al., 1991) with 1 mM
MgCl2 and an annealing temperature of 61°C.
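
As a rough illustration of how such primers can be sanity-checked, the sketch below computes the length, G+C count and a rule-of-thumb melting temperature for the 3A3 C primer pair, together with the reverse complement. The Wallace rule used here (2°C per A/T, 4°C per G/C) is an assumption introduced for illustration; it is a crude estimate intended for short oligonucleotides and is not stated in the text to be the basis for the 61°C annealing temperature. Function names are hypothetical.

```python
# The 3A3 C primer pair as listed in the text, written 5' to 3'.
PRIMERS = {
    "3A3 C1": "CGCCGCTTCACTAGCTTCGAC",
    "3A3 C2": "ACGCTCCAGAGGGAGTCCAC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_count(seq: str) -> int:
    """Number of G and C bases in the sequence."""
    return seq.count("G") + seq.count("C")


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb Tm in degrees C: 2 per A/T plus 4 per G/C."""
    at = seq.count("A") + seq.count("T")
    return 2 * at + 4 * gc_count(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, also written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, G+C = {gc_count(seq)}, "
              f"Wallace Tm = {wallace_tm(seq)} C, "
              f"reverse complement = {reverse_complement(seq)}")
```

The two primers give similar estimates under this rule, which is the usual requirement for a working pair; the absolute values should not be over-interpreted.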

PCR products for sequencing were amplified with
Pfu-1 (Stratagene) and ligated into the Srf-1 site in
PCR-Script (Stratagene) in the presence of Srf-1.

RNAse protection

Tissues from normal and end-stage polycystic
kidneys were immediately homogenised in guanidinium
thiocyanate. RNA was purified on a cesium chloride
gradient and 30 µg total RNA was assayed by RNAse
protection by the method of Melton, et al. (1984)
using a genomic template generated with the 3A3 C
primers.

Heteroduplex Analysis

Heteroduplex analysis was performed essentially as
described by Keen et al. (1991). Samples were amplified
from genomic DNA with the 3A3 C primers, heated at
95°C for 5 minutes and incubated at room temperature
for at least 30 minutes before loading on a Hydrolink
gel (AT Biochem). Hydrolink gels were run for 12-18
hours at 250V and fragments observed after staining
with ethidium bromide.

Extraction and amplification of paraffin-embedded DNA

DNA from formalin fixed, paraffin wax embedded
kidney tissue was prepared by the method of Wright and
Manos (1990), except that after proteinase K digestion
overnight at 55°C, the DNA was extracted with phenol
plus chloroform before ethanol precipitation.
Approximately 50 ng of DNA was used for PCR with 1.5 mM
MgCl2 and 40 cycles of 94°C for 1 min, 59°C for 1 min
and 72°C for 40 s, plus a 10 min extension at 72°C.
The oligonucleotide primers designed to amplify across
the genomic deletion of OX875 were:

AH4F2 : 5' - GGG CAA GGG AGG ATG ACA AG - 3'

JH14B3 : 5' - GGG TTT ATC AGC AGO AAG CGG - 3'

which produced a product of ~220 bp in individuals
with the OX875 deletion.
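
The cycling conditions given above can be captured as a small data structure, which makes it straightforward to work out the programmed hold time. This is a sketch under stated assumptions: the `Step` class and `hold_time_seconds` helper are hypothetical names, and the calculation ignores ramping between temperatures and any initial denaturation, neither of which is specified in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees C
    seconds: int  # hold time at that temperature


def hold_time_seconds(cycle, n_cycles, final_extension):
    """Total programmed hold time, excluding ramp times."""
    return n_cycles * sum(step.seconds for step in cycle) + final_extension.seconds


# Paraffin-embedded DNA protocol: 40 cycles plus a 10 min extension at 72 C.
PARAFFIN_CYCLE = [Step(94, 60), Step(59, 60), Step(72, 40)]
PARAFFIN_TOTAL = hold_time_seconds(PARAFFIN_CYCLE, 40, Step(72, 600))

# RT-PCR protocol for the AH3 F9/B7 primers: 36 cycles plus 10 min extension.
RTPCR_CYCLE = [Step(94, 60), Step(61, 60), Step(72, 120)]
RTPCR_TOTAL = hold_time_seconds(RTPCR_CYCLE, 36, Step(72, 600))

if __name__ == "__main__":
    print(f"paraffin protocol: {PARAFFIN_TOTAL / 60:.1f} min of hold time")
    print(f"RT-PCR protocol:   {RTPCR_TOTAL / 60:.1f} min of hold time")
```

Actual run times on a thermal cycler will be longer than these figures because of the ramping that the sketch leaves out.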
3' RACE analysis of WS-212

3' RACE was completed essentially as described
(European Polycystic Kidney Disease Consortium, 1994).
Reverse transcription was performed with 5 µg total RNA
with 0.5 µg of the hybrid dT17 adapter primer using
conditions previously described (Frohman et al.,
1988). A specific 3' RACE product was amplified with
the primer F5 and adapter primer in 0.5 mM MgCl2 with
the program: 57°C, 60 s; 72°C, 15 minutes and 30 cycles
of 95°C, 40 s; 57°C, 60 s; 72°C, 60 s plus 72°C, 10
minutes. The amplified product was cloned using the TA
cloning system (Invitrogen) and sequenced by
conventional methods.

CLAIMS 



1. An isolated, purified or recombinant nucleic acid
sequence comprising:-

(a) a PKD1 gene or its complementary strand,

(b) a sequence substantially homologous to, or
capable of hybridising to, a substantial portion of a
molecule defined in (a) above,

(c) a fragment of a molecule defined in (a) or (b)
above.

2. A sequence according to claim 1, wherein the PKD1
gene has the partial nucleic acid sequence according to
Figure 7 and/or 10.

3. A sequence according to claim 1 or claim 2 comprising
a DNA molecule selected from:

(a) a PKD1 gene or its complementary strand,

(b) a sequence substantially homologous to, or
capable of hybridising to, a substantial portion of a
molecule defined in (a) above,

(c) a molecule coding for a polypeptide having the
partial sequence of Figure 7,

(d) genomic DNA corresponding to a molecule in (a)
above; and

(e) a fragment of a molecule defined in any of (a),
(b), (c) or (d) above.

4. A nucleic acid sequence comprising a mutant PKD1
gene, selected from those wherein:-

(a) [OX114] base pairs 1746-2192 as defined in
Figure 7 are deleted (446 bp);

(b) [OX32] base pairs 3696-3831 as defined in
Figure 7 are deleted by a splicing defect;

(c) [OX875] about 5.5 kb flanked by the two XbaI
sites shown in Figure 3a are deleted and the EcoRI site
separating the CW10 (41 kb) and JH1 (18 kb) sites is thereby
absent; and

(d) [WS53] about 100 kb extending between the JH1
and CW21 and the SM6 and JH17 sites shown in Figure 6 and
the PKD1 gene is thereby absent.

5. A nucleic acid sequence comprising a mutant PKD1 gene
selected from those wherein:-

(a) [461] about 18 bp are deleted in the 75 bp
intron amplified by the primer pair 3A3C insert at position
3696 of the 3' sequence as shown in Figure 11;

(b) [OX1054] about 20 bp are deleted in the 75 bp
intron amplified by the primer pair 3A3C insert at position
3696 of the 3' sequence as shown in Figure 11;

(c) [WS212] about 75 kb are deleted between SM9-CW9
distally and the PKD1 3'UTR proximally as shown in Figure
12;

(d) [WS-215] about 160 kb are deleted between CW20
and CW10-CW36 as shown in Figure 12;

(e) [WS-227] about 50 kb are deleted between CW20
and JH11 as shown in Figure 12;

(f) [WS-219] about 27 kb are deleted between JH1 and
JH6 as shown in Figure 12;

(g) [WS-250] about 160 kb are deleted between CW20
and BLu24 as shown in Figure 12; and

(h) [WS194] a deletion of about 65 kb between CW20 and
CW10.

6. An RNA molecule comprising an RNA sequence
corresponding to a DNA sequence according to any of claims
1 to 5.

7. An RNA molecule according to claim 6, wherein the
molecule is the transcript referenced PKD1 and identifiable
from the restriction map of Figure 3a and having a sequence
of about 14 kb.

8. A nucleic acid probe having a sequence according to
any of the preceding claims and optionally including a
label.

9. A nucleic acid sequence according to any preceding
claim, wherein the nucleic acid sequence encoding PKD1 is
operably linked to transcriptional and/or translational
expression signals.

10. An isolated, purified or recombinant polypeptide
comprising a PKD1 protein or a mutant or variant thereof or
encoded by a sequence according to any of claims 1 to 9 or
a variant thereof having substantially the same activity as
the PKD1 protein.

11. A polypeptide according to claim 10, wherein the
PKD1 protein has the amino acid sequence according to the
partial amino acid sequence of Figure 7 and/or Figure 10.

12. An anti-PKD1 antibody or a labelled anti-PKD1
antibody.

13. A method for screening a subject to determine
whether said subject is a PKD1-associated disorder carrier
or a patient having a PKD1-associated disorder, which method
comprises detecting the presence of and/or evaluating the
characteristics of PKD1 DNA, PKD1 RNA and/or PKD1
polypeptide in a biological sample from said patient.

14. A method according to claim 13 which is or includes
detecting and/or evaluating whether the PKD1 DNA is mutated,
deleted, aberrant or otherwise abnormal, or is not
expressing normal PKD1 protein.

15. A method according to claim 13 or claim 14, wherein
the detection and/or evaluation includes the step of
comparing the results thereof with results obtained using a
mutated sequence according to claim 4 or claim 5.

16. A method according to any of claims 13 to 15,
wherein said screening includes applying a nucleic acid
amplification process to said sample to amplify a fragment
of the PKD1 DNA or cDNA corresponding to the PKD1 RNA.

17. A method according to claim 16, wherein said nucleic
acid amplification process uses at least one of the
following sets of primers as identified herein:

AH3 F9 : AH3 B7
3A3 C1 : 3A3 C2
AH4 F2 : JH14 B3

18. A method according to any of claims 13 to 17 which
comprises digesting said sample to EcoRI fragments and
hybridising with a DNA probe which hybridises to the EcoRI
fragment identified (A) in Figure 3(a).

19. A method according to claim 18, wherein said DNA
probe comprises the DNA probe CW10 identified herein.

20. A method according to any of claims 13 to 17 which
comprises digesting said sample to provide BamHI fragments
and hybridising with a DNA probe which hybridises to the BamHI
fragment identified (B) in Figure 3(a).

21. A method according to claim 20, wherein said DNA
probe comprises the DNA probe 1A1H.6 identified herein.

22. A vector (such as Bluescript (available from
Stratagene)) comprising the nucleic acid sequence of any of
claims 1 to 9.

23. A host cell (such as E. coli strain XL-1 Blue
(available from Stratagene)) transfected or transformed with
a vector according to claim 22.

24. The use of a vector according to claim 23 or a
nucleic acid sequence according to any of claims 1 to 11 in
gene therapy and/or in the preparation of an agent for
treating or preventing a PKD1-associated disorder.

25. A method of treating or preventing a PKD1-associated
disorder which method comprises administering to
a patient in need thereof a functional PKD1 gene to affected
cells in a manner that permits expression of PKD1 protein
therein and/or a transcript produced from a mutated
chromosome such as the deleted WS-212 chromosome which is
capable of expressing functional PKD1 protein therein.

26. A diagnostic kit for carrying out a method according
to any of claims 13 to 21, comprising nucleic acid primers
for amplifying a fragment of a sequence according to any of
claims 1 to 9.

27. A diagnostic kit according to claim 26, wherein the
nucleic acid primers comprise at least one of the following
sets:

AH3 F9 : AH3 B7
3A3 C1 : 3A3 C2
AH4 F2 : JH14 B3

28. A diagnostic kit for carrying out a method according
to claim 18, including one or more substances for digesting
a sample to provide EcoRI fragments and a DNA probe as
defined in claim 19.

29. A diagnostic kit for carrying out a method according
to claim 20, including one or more substances for digesting
a sample to provide BamHI fragments and a DNA probe as
defined in claim 21.

30. A diagnostic kit for carrying out a method for
determining whether said subject is a PKD1-associated
disorder carrier or a patient having a PKD1-associated
disorder, which includes a nucleic acid probe capable of
hybridising to a sequence according to any of claims 1 to
11.
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1 CTCm:J2f^03MXO^^ 60 

1 LNEEPLTLAGEEIVAQGKRS 20 

61 GAODCXXXSGftGOCTGCTGKCTATGGOG^ 12D 

21 DPRSLLCYGGAPGPGCHFSI 40 

121 OOCGAGOCnTIt^^ 180 

181 GTGG?CTa:aAT ax : mO CX:iTIQGC:TATA 240 

61 VDSNPFPFGYISNYTVSTKV 80 

241 GCXTOGATGGCATTOCAGACACAQGOOGC^^ 300 

81 ASMAFQTQAGAQIPIERLAS 100 

301 GAGaQCI30CATCAOa?IGAAGt?It3COC^^ 360 

101 ERAITVKVPNNSDWAARGHR 120 

361 AGCTCXXXXMCrmSOCAACTOCGTIG^^ 420 

121 SSANSANSVVVQPQASVGAV 140 

421 GTCACXXTOGACAGCAGCMCOCTGaGGCXXS^^ 480 

141 VTLDSSNPAAGLHLQLNYTL 160 

481 CTOSAOGGCCACTACCTCTICIGAGGA^ 540 

161 LDGHYLSEEPEPYLAVYLHS 180 

541 GRGCXXXJQGOCXaATGaGCACAACIGCIC^^ 600 

181 EPRPNEHNCSASRRIRPESL 200 

601 CAG(X^V00IG?0^fiaC03CCC^ 660 

201 QGADHRPYTFFISPGSRDPA 220 

661 GGGAGTTAOCT^TCnGAAOCnCiaiAG^ 720 

221 GSYHLNLSSHFR, WSALQVSW 240 

721 GGCXTOTACAOGTOOCTCIGCXTCTAIOTCA^^ 780 

241 GLYTSLCQYFSEEDMVWRTE 260 

781 GGGCTXTOCarnGGaGGAGAOJiaXXXX^ 840 

261 GLLPLEETSPRQAVCLTRHL 230 

841 ACOGOCTixDosoGOCAGcrrrcrna^ 900 

281 TAFGASLFVPPSHVRFVFPE 300 

901 OI^ACAGaSGATGTAAACrACATOC?!^ 960 

301 PTADVNYIVMLTCAVCLVTY 320 

961 ATGCnCATGGCOQOIATCXnGCACAAGCro^^ 1020 

321 MVMAAILHKLDQLDASRGRA 340 

1021 ATCXXTITCIX?IGC3GCAGa3GQQC^^ 1080 

341 IPFCGQRGRFKYEILVKTGW 360 

1081 GGCra3GGCTCAGC?rADCAa3GOOCyy3C?^^ 1140 

361 GRGSGTTAHVGIMLYGVDSR 380 

1141 AGCGGCTAOIXSGCAarrGGAaGGaSACA^ 1200 

381 SGHRHLDGDRAFHRNSLDIF 400 

1201 CGGAT0QOCXrca3CACAGCXnGGC?rAGOGTC^^ 1260 

401.RIATPHSLGSVWKIRVWHDN -420 

Figure 7
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1261 AAAGGGC:iCAGCIXnT30CTGC?ITaCTOC^^ 1320 

421 KGLCPAWFLQHVIVRDLOTA 440 

1321 CGCPJ303J Jn V L 'lOJYO3VC^ 1380 

441 RSAFFLVNDWLSVETEANGG 460 

1381 crGGTGGftGAAGGAGGTGCTXXraCXI3^^ 140 

461 LVEKEVLAASDAALLRFRRL 480 

1441 croCTTOGCTCADCrGCAGa?!^^ 1500 

481 LVAE LQRGFFDKHIWLSIWD 500 

" 1501 aSGaOGCrroGTAGOOGriTCACOTGCATCXIAGAGGGCr^^ 1560 

501 RPPRSRFTRIQRATCCVLLI 520 

1561 TGOC'ItJilUL i ajGOGOCAAOGCXint?!^ 1620 

521 CLFLGANAVWYGAVGDSAYS 540 

1621 ACGGGGCMGTGTCCAGGCTOAGCCXXrr^^ 1680 

541 TGHVSRLSPLSVDTVAVGLV 560 

1681 TCCAGCC?It33rTGICTATCXCXnCTAO^^ 1740 

561 SSVVVYPVYLAILFI.FRMSR 580 

1741 AGCAAGGIGGCTGGGAGCXXX^AGaSXCftC^^ 1800 

581 SKVAGSPSPTPAGQQVLDID 600 

1801 ASCraXTOGAOrCCJrOXTC^^ 1860 

601 SCLDSSVLDSSFLTFSGLHA 620 

1861 GAGGOCITICTIGGACAGATGAAGAGr^^ 1920 

621 EAFVGQMKSDLFLDDSKSLV 640 

1921 TGCTGGCOCTCOGQOGAGGGAAOXTCTi^^ 1980 

641 CWPSGEGTLSWPDLLSDPSI 660 

1981 GTGGCn'AGCAATCT0OGGCy>kG:TOGCAOGGGQ0CA(^^ 2040 

661 VGSNLRQLARGQAGHGLGPE 680 

2041 a^GGAa^GCITC^CaCTGG0CAGO00CIC^^ 2100 

681 EDGFSLASPYSPAKSFSASD 700 

2101 GAAGAOCrrU^TOCAGGAGCnCCTIGCXXS^^ 2160 

701 EDLIQQVLAEGVSSPAPTQD 720 

2161 AaXACATGGAAAOSCi^CCTOCTCAGCAGC^ 2220 

721 THMETDLLSSLSSTPGEKTE 740 

2221 AaXTOGOCTGCAGAGGCTOGQGGAGCnXSGGGCX?^^ 2280 

741 TLALORLGELGPPSPGLNVJE 760 

2281 CAGOCXXZAGGCAGOGAGGCTCnaiyVGGACAGGA:^^ 2340 

761 QPQAARLSRTGLVEGLRKRL 780 

2341 CmaDGOCXriX3GTGTQGCTOCT03CXX:AaGQ[rTC^^ 2400 

781 LPAWCASLAHGLSLLLVAVA 800 

2401 GIGGCICTCTCAGGGTGGGTGGGIGOGAGC^^ ' 2460 

801 VAVSGWVGASFPPGVSVAWL 820 

2461 CTCIOilAGCAGOGCCAGCrTOri^^ 2520 

821 LSSSASFLASFLGWEPLKVL 840 

Figure 7 cont'd
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2521 ciX33AAG0CX:iCTAClTCICA^ 2580 

841 LEALYFSL.VAKRLHPDEDDT 860 

2581 ciOTTAGAGAGOCnSGCrrGTGAaXXTC^ 2640 

861 LVESPAVT.PVSARVPRVRPP 880 

2541 CAOSQCOTKXACTCTICCOTGaCAAOGAAG 2700 

881 HGFALF LAKEEARKVKRLHG 900 

2701 ATOC:rGCGGfiOCCi:C^^ 2760 

901 MLRSLLVYMLFLLVTLLASY 920 

2761 GGGGftlGOCTCATGO^^TOGCrACQ^ 2820 

921 GDASCHGHAYRLQSAIKQEL 940 

2821 CACAGO33330CTVCCT(^^ 2880 

941 HSRAFLAITRSEELWPWMAH 960 

2881 GIGCKXnX9CXCTAa?IXXAD^^ 2940 

961 VLLPYVHGNQSSPELGPPRL 980 

2941 CGGCaGC?K30GCXTOCAGGAAGCRCOT 3000 

981 RQVRLQEALYPDPPGPRVHT 1000 

3001 TGCTCGGOCI^CAGGAGGCTrCAGCAaCAGOGATrAa^ 3060 

1001 CSAAGGFSTSDYDVGWESPH 1020 

3061 AATOXOTGGGGAOGriGQGCCTAriCAGCXmX^ 3120 

1021 NGSGTWAYSAPDLLGAWSWG 1040 

3121 TCCIUnxri?ICTATGACAGOGG^ 3180 

1041 SCAVYDSGGYVQELGLSLEE 1060 

3181 AGOaXX3AaDC3GCrKXXX:iTOCT^^ 3240 

1061 SRDRLRFLQLHNV7LDNRSRA 1080 

3241 GTOITCCTOGAGCrCACGCXCTAC^^ 3300 

1081 VFLELTRYSPAVGLHAAVT L 1100 

3301 C(XXjra^P£JrJXXXXXXX2GC^^ 3360 

1101 RLEFPAAGRALAALSVRPrA 1120 

3361 CV0C:O0:S3CCTCI>J3^^ 3420 

1121 LRRLSAGLSLPLLTSVCLLL 1140 

3421 Tia30CXnxX3CTICXXXX?nXXXr^ 34S0 

1141 FAVHFAVAEARTWHREGRVJR 1160 

3481 GTGCTOOGGCTOQGAGCXrrGGGaXI^GIGGCT^^ 3540 

1161 VLRLGAWARWLLVALTAATA 1180 

3541 cTCGTAax£na3oa:y\Gcna3Gn3^^ 36ao 

1181 LVRLAQLGAADRQWTRFVRG 1200 

3601 OGCCOGCXXX]QCTICACTAGCITOGAa:^^ 3660 

1201 RPRRFTSFDQVAHVSSAARG 1220 

3661 CTGGOSGCxrrugiari ci ' ia J i G L ' ii ' ii GGrc^^ 3720 

1221 LAASLLFLLLVKAAQHVRFV 1240 

3721 CGOCMlTGCTiaCTOriTOGCAA 3780 

1241 RQWSVFGKTLCRALPELLGV 1260 

Figure 7 cont'd
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3781 
1261 

3841 
1281 

3901 
1301 

3961 
1321 

4021 
1341 

4081 
1361 

4141 
1381 

4201 
1401 

4261 
1421 

4321 
1441 

4381 
1451 

4441 
1481 

4501 
1501 

4561 
1521 

4621 

1541 

4681 
1561 

4741 
1581 

4801 
1601 

4861 

4921 

4981 

5041 



TLGLVVLGVAYAQLAI LLVS 
TCCTCnCTGGACiaXTCIQGftGOCT^^ 

SCVDSLWSVAQALLVLCPGT 
GQGCTClCTAOaiTOrcTOCTQCXXftGI^^ 

GLSTLCPAESWHLSPLLCVG 
CICraSQCIACia^GGCTGTGGGGaGOa^ 

LWALRLWGALRLGAVILRWR 

TAoaoGocnaDcnaaGaQCTgrACOQQO^ 

YHALRGELYRPAWEPQDYEM 

VELFLRRLRLWMGLSKVKEF 
CGCXZACAAAGlOCGCnOTGAAGGGATGGAGCXaXn^^ 

RHKVRFEGMEPLPSRSSRGS 




TCCTCCAGCCAGCTGasLTGGSCIGAGOC^^ 

SSSQLDGLSVSLGRLGTRCE 

PEPSRLQAVFEALLTQFDRL 
AAOiiTVGGOCACAGAGGAajICTADCAGCT^^ 

NQATEDVYQLEQQLHSLQGR 
AGGAGCAGCOGGGOGaXXXXDGG.\TCTTCXrGIG^^ 

RSSRAPAGSSRGPSPGLRPA 



LPSRLARASRGVDLATGPSR 



ACACrTra3GQ0CAAGVCA;i.GCntX:ACXXrAGCAQC^ 

TPSGQEQGPPOQHLVLLPGG 
GC?IXXX3CCX?IGGAGra3GAGTGGAC^ 

GGPWSRSGHRSVLLSAAVKA 
GAGGGOIlAGGCAGA^VTGCSCTGCACGTAGGn^^ 

EGQAEWLHVGSPE S.R Q G H L S 
grCTGTGGGClTCAGCXnTrAAAGAGQCI^^ 

VCGLQHFKEAVWPTRTQGPL 

COIAGCTOOCTIGGGAAGGyCACAGCAGTA^^ 
PSSLGKDTAVLDGF 

TITATITC0CXDGAgiCCIX:fiG3TACAQGQG 



gICOCa:yCT3CTAAGGCTrnX3G^^ 



CaXn-AACOTATTAOCICItrAGTIOCrAO^ 
TCGICICAGTAATITATATGGTGTrAAAATGICT 
Figure 7 Cont'd 
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5101 r?333C:TGM3033CCT^^ 5160 

5161 TOia?ixjLXjL;iTATqgcjL^^ 5220 

5221 cmGGGGCACAGOGICKXXy^GCr:ft^^ 5280 

5281 TOXCIMGCXIXGCTrAGCAAGAGft^^ 5340 

5341 CTAlXAGGACrAGGCATCnCAGftGGftOOOCAGCX^^ 5400 

5401 GGCCIgXTOOCAQGgiaSAGSAAGCm;^^ 5450 

5461 GCX^ftCTGTOCrcrATQGOCCAGGC^^ 5520 

5521 rGTGrPCCPL:nCiVlXX3O0A^^ 5580 

5581 AOCAAGCAGACAAACnCAATAAAAGAGCTGTC^^ 5631 



1A1H0,6 

1 AAGCTTGSCA OCATCAAGGG CCAGTICAAC TITCrOCADG TGATCGICAC CmaCTOGAC 

61 TAOGAGTrGCA AaJIOTOIC GC'lGGftGIO: AGGAAftGACA ODGAQGGCCr TCIGSACAOC 

121 AGOOraoaCA AGA'ia&TGlC T3?m3CAAC CIGOOCnaG TGGOOOGOCA (^TGCOOCIG 

181 CAOQCAAATA TOQCCICACA GCTOCATCAT KSOOGCIOCA AOOOCADOGA TAICTAOOX 

241 TOCAAl^rGGA TTQOXQQCT CTOXACATC AAGOOQCTOC G0CAG0Q3AT CK30GAQSAA 

301 GOOQCCTACT IXAACGCCAG CCTAGCICTS GIQCAOOCTC OnOOCATAG CAAAGGOOCT 

361 GCACAGACrC CAOXEAQaC CACACCTGCC TATGA3C?ia3 G0CAf30QGAA GOQOCICATC 

421 TCrTOQGTGG AGGftCITCAC aSAGITIGrG TGAQQOOGGG GOOCICOCTC CIt3CACIQ3C 

481 CrTOGftOCT ATTGC3CIGIC AGTIGAAATAA ATAAAGICCT GAOGOCACTTG CACAGACATA 

541 GAGGCACAGA nO: 

Figure 8
CW10F

1 GTOOGOGGTC GCACGTAOr TICIt3GIGIG TGIQOOGT GOGGGQCTGG GAAGICTTGG 

61 CAGAaSGOSA CTTACGIOCIC ACltX-TiTlG TICnTIGAC CTAAGCIGGC GAGIGGC^rT 

121 GCTGAUriXX; GGTCAGKXi: CXXCCTCATG TQGGAfXOX GIGC^.TICIT GinCTTAGGT 

181 GGIGSOGGPG TG 

CW10R

1 AGGCAGGflCr OGOOGAOSAG CAGCSGGAGftG GCAOOCAAGS T 
Figure 9
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(xi) SEQUENCE DESCREPTION: SEQ ID NO: 1: (Caipane Fig.l) 

C GGC GOC GOC TQC CCC GTC AAC TGC TCG GGC OQC GOG CTG OGG ADG 46 
Gly Ala Ala Cys Arg Val Asn O/s Ser Gly Arg Gly Leu Arg Thr 
15 10 15 

CTCGGTOXGCXSCnGCGCATCCXrGOGGftCGOCACAGaJCTAGACC^ 94 
Leu Gly Pro Ala Leu Arg lie Pro Ala Asp Ala Thr Ala Leu Asp Val 

20 25 30 

TCC CAC AAC CTG CIC CGGGCX3CrGGACC?ITGGGCTCCrGGC33AACCrC 142 
Ser His Asn Leu Leu Arg Ala Leu Asp Val Gly Leu Leu Ala Asn Leu 
35 40 45 

TCG GOG CTG GCA GAG CTG GAT ATA AGC AAC AAC AAG ATT TCT AOG 1TA 190 
Ser Ala Leu Ala Glu Leu Asp lie Ser Asn Asn Lys lie Ser Thr Leu 
50 55 60 

GPA GAA GGA ATA TTT GCT AAT TTA TTT AAT TTA AGT GAA ATA AAC CTG 238 
Glu Glu Gly He Phe Ala Asn Lsu Phe Asn Leu Ser Glu He Asn Leu 
65 70 75 

AGT G3G AAC OCG TTT GAG TGT GAC TGT GGC CTG GOG TGG CTG OCG CX3A 236 
Ser Gly Asn Pro Phe Glu Cvs Asp Cys Gly Leu Ala Trp Leu Pro Arg 
80 85 " 90 95 

TGG GCG GAG GAG CAG CAG GIG CGG GTG GTG CAG OOC GAG GCA GCC ACG 334 
Trp Ala Glu Glu Gin Gin Val Arg Val Val Gin Pro Glu Ala Ala Thr 

100 105 110 

TGT GCr GGG CCT GGC TCC CIG GOT GGC CAG OCT CTG CTT GGC ATC OJZ 332 
Cys Ala Gly Pro Glv Ser Leu Ala Gly Gin Pro Leu Leu Gly lie Pro 
115 * 120 125 

TTG CTG GAC AGT GGC TGT GGT GAG GAG TAT GTC GOC TGC CTC OCT GAC 430 
Leu Leu AsD Ser Gly Cys Gly Glu Glu Val Ala Cys Leu Pro Asp 
130 135 140 

AAC AGC TCA GGC ACC GTG GCA GCA GTG TCC TIT TCA GCT GOC CAC GAA. <78 
Asn Ser Ser Gly Thr Val Ala Ala Val Ser Phe Ser Ala Ala His Glu 
145 150 155 

GGC CTG CTT CAG OCA GAG GCC TGC AGC GOC TIC TGC TIC TOC ACC GGC 5^6 
Gly Leu Leu Gin Pro Glu Ala Cys Ser Ala Phe cys Phe Ser Thr Gly 
160 165 170 175 

CAG GGC CTC GCA GOC CTC TCG GAG CAG GGC TGG TGC CTG TCTT GGG GCG 574 
Gin Glv Leu Ala Ala Leu Ser Glu Gin Gly Trp Cys Leu Cys Gly Ala 

180 185 190 

GOCCAG0CCTCCAC?rG0CTOCTrrG0CTGCCTGTOCCrCTGCT0CGOC 622 
Ala Gin Pro Ser Ser Ala Ser Phe Ala Cys Leu Ser Leu Cys Ser Gly 
195 200 205 

OOC OCG OCA OCT OCT GCC 000 ACC TGT AGG GGC OOC AOC CTC CTC CAG 670 
Pro Pro Pro Pro Pro Ala Pro Thr Cys Arg Gly Pro Thr Leu Leu Gin 
210 215 220 

CAC GIC TTC OCT GOC TCC OCA GGG GOC AOC CTG GTG GGG 000 CAC GGA 718 
His Val Phe Pro Ala Ser Pro Glv Ala Thr Leu Val Gly Pro His Gly 
225 230 235' 
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OCT CIG GCC TCT G3C CAG CPA GCA GOC TIC CAC ATC OCT GOC COG CIC 766 
Pro Leu Ala Ser Gly Gin Lbu Ala Ala Pha His He Ala Ala Pro Leu 
240 245 250 255 

CX:r GTC ACT GOC ACA CX3C TGG GftC TTC GGA GAC GC3C TOC GOC GAG GTG 814 
Pro Val Thr Ala Itir Arg Trp Asp Fhe Gly Asp Gly Ser Ala Glu Val 

260 265 270 

GAT GOC GCT GOG CXX3 GCT GOC TCX5 CAT OGC TAT GIG CIG OCT GGG OGC 862 
Asp Ala Ala Gly Pro Ala Ala Ser His Arg Tyr Val Leu Pro Gly Arg 
275 280 285 

TAT CAC CTTG AOG GCC GIG CTG GOC CIG GOG GOC GGC TCA GOC CTG CIG 910 
Tyr His Val Thr Ala Val Leu Ala Leu Gly Ala Gly Ser Ala Leu Leu 
290 295 300 

GGG ACA GAC GIG CAG GIG GAA GOG GCA CCT OX GOC CTG GAG CTC GTG 958 
Gly TtYC Asp Val Gin Val Glu Ala Ala Pro Ala Ala Leu Glu Leu Val 
305 310 315 

TGC COG TOC TOG GTG CAG AGT GAC GAG AGO CTT GAC CTC AGC ATC CAG 1006 
Cys Pro Ser Ser Val Gin Ser Asp Glu Ser Leu Asp Leu Ser He Gin 
320 325 330 335 

AAC OGC GGT GGT TCA GGC CIG GAG GOC GOC TAC AGC ATC GTG GOC CTG 1054 
Asn Arg Gly Gly Ser Gly Leu Glu Ala Ala Tyr Ser He Val Ala Leu 

340 345 350 

GGC GAG GAG COG GOC OGA' GOG GTG CAC OOG CTC TGC COC TOG GAC AOG 1102 
Gly Glu Glu Pro Ala Arg Ala Val His Pro Leu Cys Pro Ser Asp Thr 
355 360 365 

GAG ATC TTC OCT GGC AAC GGG CAC TGC TAC OGC CTG GIG GTG GAG AAG 1150 
Glu He Phe Pro Gly Asn Gly His Cys Tyr Arg Leu Val Val Glu Lys 
370 375 380 

GOG GOC TGG CIG CAG GOG CAG GAG CAG TGT CAG GOC TGG GOC GGG GOC 1198 
Ala Ala Trp Leu Gin Ala Gin Glu Gin Cys Gin Ala Trp Ala Gly Ala 
385 390 395 

GOC CTG GCA ATG GTG GAC ACT COC GCC GTG CAG CGC TIC CTG GTC TCC 1246 
Ala Leu Ala Met Val Asp Ser Pro Ala Val Gln Arg Phe Leu Val Ser
400 405 410 415 

OGG CTC AOC AGG AGC CTA GAC GIG TGG ATC GGC TTC TOG ACT GTG CAG 1294 
Arg Val Thr Arg Ser Leu Asp Val Trp He Gly Phe Ser Thr Val Gin 

420 425 430 

GGG GTG GAG GTG GOC OCA GOG OOG CAG GGC GAG GOC TTC AGO CTG GAG 1342 
Gly Val Glu Val Gly Pro Ala Pro Gin Gly Glu Ala Phe Ser Leu Glu 
435 440 445 

AGCTGCCAGAACTQGCTGOOCGGGGAGOCACACOCAQOC ACA GOC GAG 1390 
Ser Cys Gln Asn Trp Leu Pro Gly Glu Pro His Pro Ala Thr Ala Glu
450 455 460 

CAC TGC GTC OGG CTC GGG 000 AGO GGG TGG TCT AAC AOC GAC CTG TGC 1438 
His Cys Val Arg Leu Gly Pro Thr Gly Trp Cys Asn Thr Asp Leu Cys 
465 470 475 




TCA GOG 0CX3 CAC AGC TAG GTC TQC GAG CTG GAG OOC GGA GGC OCA GTG 1486 
Ser Ala Pro His Ser Tyr Val Cys Glu Leu Gin Pro Gly Gly Pro Val 
480 485 490 495 

GAG GAT GOC GAG AAC CPC CTC GTG GGA GOG COC AC?r GGG GAC CTG GAG 1534 
Gin Asp Ala Glu Asn Leu Leu Val Gly Ala Pro Ser Gly Asp Leu Gin 

500 505 510 

GGA CCC CTG AOG OCT CTG GCA CAG CAG GAC GGC CTC TCA GOC COG CAC 1582 
Gly Pro Leu Thr Pro Leu Ala Gin Gin Asp Gly Leu Ser Ala Pro His 
515 520 525



GAG OOC GTG GAG CJTC ATG CJTA TIC OOG GGC CTG OCT CTG AGC OCT GAA 1630 
Glu Pro Val Glu Val Met Val Phe Pro Gly Leu Arg Leu Ser Arg Glu 
530 535 540 

GOC TTC CTC AOC ACG GOC GAA TTT GGG AOC CAG GAG- CTC OGG OGG OOC 1678 
Ala Phe Leu Thr Thr Ala Glu Phe Gly Thr Gin Glu Leu Arg Arg Pro 
545 550 555 

GCCCAGCTGOGGCTGCAGGPGTACOGGCTCCICAGCACAGCAGQGAOC 1726 
Ala Gln Leu Arg Leu Gln Val Tyr Arg Leu Leu Ser Thr Ala Gly Thr
560 565 570 575

COG GAG AAC GGC AGO GAG OCT GAG AGC AGG TOO OOG GAC AAC AGS AOC 1774 
Pro Glu Asn Gly Ser Glu Pro Glu Ser Arg Ser Pro Asp Asn Arg Thr 

580 585 590 

CAG CTG 000 OOC GOG TGO ATG OCA GGG GGA CGC TGG TGC OCT GGA GOC 1822 
Gln Leu Ala Pro Ala Met Pro Gly Gly Arg Trp Cys Pro Gly Ala
595 600 605

AAC ATC TGC TTC OOG CTG GAC GOC TCT TGC CAC OOC CAG GOC TOC GOC 1870 
Asn lie Cys Leu Pro Leu Asp Ala Ser Cys His Pro Gin Ala Cys Ala 
610 615 620 

AAT GGC TGC AOG TCA GGG CCA GGG CTA OOC GGG GOC OOC TAT GOG CTA 1918 
Asn Gly Cys Thr Ser Gly Pro Gly Leu Pro Gly Ala Pro Tyr Ala Leu 
625 630 635 

TGGAGAGAGTTCCTCTlCTCCGrrGOCGOGGGGCOCOaCGOGCAGTAC 1966 
Trp Arg Glu Phe Leu Phe Ser Val Ala Ala Gly Pro Pro Ala Gln Tyr
640 645 650 655 

TOG CTC AOC CTC CAC GGC CAG GAT GTC CTC ATG CTC OCT GCT'GAC CTC 2014 
Ser Val Thr Leu His Gly Gin Asp Val Leu Met Leu Pro Gly Asp Leu 

660 665 670 

GTTGGCTTCCAGCACGACGCTGGCOCTGGCGOCCICCrGCACTGCTCG 2062 
Val Gly Leu Gln His Asp Ala Gly Pro Gly Ala Leu Leu His Cys Ser
675 680 685 

OOGGCTCOCGGCCAC0CTOCTaCX:CAGG0C00GTACCICT0CGa:AAC 2110 
Pro Ala Pro Gly His Pro Gly Pro Gin Ala Pro Tyr Leu Ser Ala Asn 
690 695 700 

GOCTOGTCATGGCrGOOCCACTrGOCAGOCCAGCrGGAGGGCACTTCG 2158 
Ala Ser Ser Trp Leu Pro His Leu Pro Ala Gin Leu Glu Gly Thr Trp 
705 710 715

GCC TGC CCT GCC TGT GCC CTG CGG CTG CTT GCA GCC ACG GAA CAG CTC 2206
Ala Cys Pro Ala Cys Ala Leu Arg Leu Leu Ala Ala Thr Glu Gln Leu
720 725 730 735 

AOC GIG CTG CTG GGC TTCAGGOXAACOCTGGACrGOGGATGCCrGGG 2254 
Thr Val Leu Leu Gly Leu Arg Pro Asn Pro Gly Leu Arg Met Pro Gly
740 745 750

03C: TAT GAG GTC 030 GCA GAG (JIG GGC AAT GGC GTG TOC AGG CAC AAC 2302 
Arg Tyr Glu Val Arg Ala Glu Val Gly Asn Gly Val Ser Arg His Asn 
755 760 765 

CrC TOC TGC AGC TIT GAC GIG CTTC TOC OCA GIG OCT GOG CTG OGG GTC 2350 
Leu Ser Cys Ser Phe Asp Val Val Ser Pro Val Ala Gly Leu Arg Val
770 775 780 

ATC TAG OCT GOC COC GGC GAC GGC OGC CTC TAG GTG COC AOC AAC GGC 2398 
Ile Tyr Pro Ala Pro Arg Asp Gly Arg Leu Tyr Val Pro Thr Asn Gly
785 790 795 

TCA GCC TIG GIG CTC CAG GIG GAC TCT GGT GCC AAC GOC AOG GOC AOG 2446 
Ser Ala Leu Val Leu Gin Val Asp Ser Gly Ala Asn Ala Thr Ala Thr 
800 805 810 815 

GOT OGC TGG OCT GGG GGC AGT GIC AGC GOC OGC TTT GAG AAT GTC TGC 2494 
Ala Arg Trp Pro Gly Gly Ser Val Ser Ala Arg Phe Glu Asn Val Cys 

820 825 830 

COT GCC CTG GTG GOC AOC TTC GTG COC GGC TGC OCC TGG GAG ACC AAC 2542 
Pro Ala Leu Val Ala Thr Phe Val Pro Gly Cys Pro Trp Glu Thr Asn 
835 840 845 

GAT AOC CTG TTC TCA GTG GTA GCA CTG COG TGG CTC AGT GAG GGG GAG 2590 
Asp Thr Leu Phe Ser Val Val Ala Leu Pro Trp Leu Ser Glu Gly Glu 
850 855 860 

CAC GTG GTG GAC GTG GTG GTG GAA AAC AGC GGC AGC CGG GCC AAC CTC 2638
His Val Val Asp Val Val Val Glu Asn Ser Ala Ser Arg Ala Asn Leu 
865 870 875 

AGC CTG OGG GIG AOG GOG GAG GAG OCC ATC TGT GGC CTC OGC GCC ACG 2686 
Ser Leu Arg Val Thr Ala Glu Glu Pro Ile Cys Gly Leu Arg Ala Thr
880 885 890 895 

OCC AGC COC GAG GOC OGT GTA CTG CAG GGA GIC CTA GIG AGG TAG AGC 2734 
Pro Ser Pro Glu Ala Arg Val Leu Gln Gly Val Leu Val Arg Tyr Ser
900 905 910

COCGTGGTGGAGGOCGQCTOGGACATGGTCTTCOGGTGGAOCATC AAC 2782 
Pro Val Val Glu Ala Gly Ser Asp Met Val Phe Arg Trp Thr lie Asn 
915 920 925 

GAC AAG CAG TCC CTG ACC TTC CAG AAC GTG GTC TTC AAT GTC ATT TAT 2830
Asp Lys Gln Ser Leu Thr Phe Gln Asn Val Val Phe Asn Val Ile Tyr
930 935 940 

CAG AGC GOG GOG GTC TTC AAG CIC TCA CTG AjOG GCC TOC AAC CAC GTG 2878 
Gln Ser Ala Ala Val Phe Lys Leu Ser Leu Thr Ala Ser Asn His Val
945 950 955

AGC AAC GTC ACC GTG AAC TAC AAC GTA ACC GTG GAG CGG ATG AAC AGG 2926
Ser Asn Val Thr Val Asn Tyr Asn Val Thr Val Glu Arg Met Asn Arg
960 965 970 975 

ATC GAG GGT GIG CAG GTC TOC ACA GIG OOG GOC GIG CTG TOC CCC AAT 2974 
Met Gln Gly Leu Gln Val Ser Thr Val Pro Ala Val Leu Ser Pro Asn
980 985 990

CCC ACA .CTG GTA CTG AOG GGT GGT GIG CTG GIG GAC TCA GCT GTG GAG 3022 
Ala Thr Leu Val Leu Thr Gly Gly Val Leu Val Asp Ser Ala Val Glu 
995 1000 1005 

GIG GOC TIC CTG TGG AAC TTT GGG GAT GGG GAG GAG GOC CTC CAC GAG 3070 
Val Ala Phe Leu Trp Asn Phe Gly Asp Gly Glu Gin Ala Leu His Gin 
1010 1015 1020 

TIC CAG OCT COG TAG AAC GAG TOC TIC OOG GIT OCA GAC OOC TCG GTG 3118 
Phe Gin Pro Pro Tyr Asn Glu Ser Phe Pro Val Pro Asp Pro Ser Val 
1025 1030 1035 

GOC CAG GIG CTG GIG GAG CAC AAT GTC ATG CAC AOC TAC GCT GOC OCA 3166 
Ala Gln Val Leu Val Glu His Asn Val Met His Thr Tyr Ala Ala Pro
1040 1045 1050 1055 

GGT GAG TAC CTC CTG ADC GIG CTG GCA TCT AAT GOC TIC GAG AAC CTG 3214 
Gly Glu Tyr Leu Leu Thr Val Leu Ala Ser Asn Ala Phe Glu Asn Leu
1060 1065 1070

AOG CAG CAG GIG OCT GIG AGC GIG OGC GOC TOC CTG OCC TOC GIG GCT 3262 
Thr Gin Gin Val Pro Val Ser Val Arg Ala Ser Leu Pro Ser Val Ala 
1075 1080 1085 

GIG GGT GTG ACT GAC GGC GIC CTG GIG GOC GGC OGG CCC GTC ACC TIC 3310 
Val Gly Val Ser Asp Gly Val Leu Val Ala Gly Arg Pro Val Thr Phe
1090 1095 1100 

TAC OOG CAC OOG CIG OCC TOG OCT GGG GGT GTT CTT TAC AOG TGG GAC 3358 
Tyr Pro His Pro Leu Pro Ser Pro Gly Gly Val Leu Tyr Thr Trp Asp
1105 1110 1115 

TTC GGG GAC GGC TOC OCT GTC CTG ADC CAG AGC CAG COG GCT GCC AAC 3406 
Phe Gly Asp Gly Ser Pro Val Leu Thr Gin Ser Gin Pro Ala Ala Asn 
1120 1125 1130 1135 

CAC AOC TAT GCC TOG AGG GGC AOC TAC CAC GTG OGC CTG GAG GIC AAC 3454 
His Thr Tyr Ala Ser Arg Gly Thr Tyr His Val Arg Leu Glu Val Asn 

1140 1145 1150 

AAC AOG GIG AGC GCT GOG GOG GOC CAG GOG GAT GTG OGC GIC TTT GAG 3502 
Asn Thr Val Ser Gly Ala Ala Ala Gin Ala Asp Val Arg Val Phe Glu 
1155 1160 1165 

GAG CTC OGC GGA CTC AGC GIG GAC ATG AGC CTG GOC GTG GAG CAG GGC 3550 
Glu Leu Arg Gly Leu Ser Val Asp Met Ser Leu Ala Val Glu Gin Gly 
1170 1175 1180 

GOC OOC GTG GTG GIC AGC GOCGOGGIGCAGADGGQCGACAACATCAOG 3598 
Ala Pro Val Val Val Ser Ala Ala Val Gln Thr Gly Asp Asn Ile Thr
1185 1190 1195

TGG ACC TTC GAC ATG GGG GAC GGC ACC GTG CTG TCG GGC CCG GAG GCA 3646
Trp Thr Phe Asp Met Gly Asp Gly Thr Val Leu Ser Gly Pro Glu Ala
1200 1205 1210 1215 

ACA GIG GAG CAT GTG TAG CTG COG GCA CAG AAC TGC ACA GTG AOC GTG 3694 
Thr Val Glu His Val Tyr Leu Arg Ala Gln Asn Cys Thr Val Thr Val
1220 1225 1230

GGT'GOG GOC AGC OOC GCC GGC CAC CTG GOC OGG AGC CTG CAC GIG CTG 3742 
Gly Ala Ala Ser Pro Ala Gly His Leu Ala Arg Ser Leu His Val Leu
1235 1240 1245 

GTC TIC G?rC CTG GAG GIG CTG OGC GIT GAA OCC GOC GOC TGC ATC COC 3790 
Val Phe Val Leu Glu Val Leu Arg Val Glu Pro Ala Ala Cys Ile Pro
1250 1255 1260 

AOS CAG OCT GAC GOG OGG CIC AOG GCC TAG GTTC AOC GGG AAC OCG GOC 3838 
Thr Gin Pro Asp Ala Arg Leu Thr Ala Tyr Val Thr Gly Asn Pro Ala 
1265 1270 1275 

CAC TAG CTC TIC GAC TGG AOC TIC GGG GAT GGC TOO TOC AAC AOG AOC 3886 
His Tyr Leu Phe Asp Trp Thr Phe Gly Asp Gly Ser Ser Asn Thr Thr 
1280 1285 1290 1295 

GTG OGG GGG TGC COG AOG GTG ACA CAC AAC TIC AOG OGG AGC GGC AOG 3934 
Val Arg Gly Cys Pro Thr Val Thr His Asn Phe Thr Arg Ser Gly Thr 

1300 1305 1310 

TIC OOC CTG GOG CTG GIG CTG TOC AGC OGC GIG AAC AGG GOG CAT TAC 3982 
Phe Pro Leu Ala Leu Val Leu Ser Ser Arg Val Asn Arg Ala His Tyr 
1315 1320 1325 

TIC AOC AGC ATC TGC GTG GAG OCA GAG GTG GGC AAC GIC AOC CTG CAG 4030 
Phe Thr Ser lie Cys Val Glu Pro Glu Val Gly Asn Val Thr Leu Gin 
1330 1335 1340 

OCA GAG AiSG CAG TTT GTG CAG CTC GGG GAC GAG GOC TGG CTG GTG C3:::a 4078 
Pro Glu Arg Gin Phe Val Gin Leu Gly Asp Glu Ala Trp Leu Val Ala 
1345 1350 1355 

TGT GCC TGG OOC OOG TIC OOC TAC GGC TAC ACC TGG GAC TIT GGC AOC 4126 
Cys Ala Trp Pro Pro Phe Pro Tyr Arg Tyr Thr Trp Asp Phe Gly Thr
1360 1365 1370 1375

GAG GAA GCC GCC OOC ADC OCT GOC AGG GGC OCT GAG GTG ACG 1TC ATC 4174 
Glu Glu Ala Ala Pro Thr Arg Ala Arg Gly Pro Glu Val Thr Phe lie 

1380 1385 1390 

TAC OGA GAC OCA GGC TOC TAT CTT GIG ACA GIC AOC GOG TOC AAC AAC 4222 
Tyr Arg Asp Pro Gly Ser Tyr Leu Val Thr Val Thr Ala Ser Asn Asn 
1395 1400 1405 

ATC TCr GOT GOC AAT GAC TCA GX CIG GIG GAG GTG CAG GAG OOC GTG 4270 
Ile Ser Ala Ala Asn Asp Ser Ala Leu Val Glu Val Gln Glu Pro Val
1410 1415 1420 

CTG GTC ACC AGC ATC AAG GTC AAT GGC TCC CTT GGG CTG GAG CTG CAG 4318
Leu Val Thr Ser Ile Lys Val Asn Gly Ser Leu Gly Leu Glu Leu Gln
1425 1430 1435

CAG COG TAG CTG TIC TCT OCT GTS 03Z 03T GGG 03C CCC GOC AGC TAC 4366
Gln Pro Tyr Leu Phe Ser Ala Val Gly Arg Gly Arg Pro Ala Ser Tyr
1440 1445 1450 1455 

CTG TGG GAT CTG GOG GAC GCTT GGG TGG CIC GAG GGT COG GAG GTC AOC 4414 
Leu Trp Asp Leu Gly Asp Gly Gly Trp Leu Glu Gly Pro Glu Val Thr 

1460 1465 1470 

CAC GCr TAC AAC AGC ACA GGT GAC TIC ACC GIT AGG GIG GOC GGC TGG 4462 
His Ala Tyr Asn Ser Thr Gly Asp Phe Thr Val Arg Val Ala Gly Trp
1475 1480 1485 

AAT GAG GTG AGC OGC AGC GAG GOC TOG CIC AAT GTG AOG GTG AAG OGG 4510 
Asn Glu Val Ser Arg Ser Glu Ala Trp Leu Asn Val Thr Val Lys Arg
1490 1495 1500 

OGC GIG OGG GGG CIC GTC C?rC AAT GCA AGC OGC AOG GTG GTG GOC CTG 4558 
Arg Val Arg Gly Leu Val Val Asn Ala Ser Arg Thr Val Val Pro Leu 
1505 1510 1515

AAT GGG AGC GTG AGC TTCAGCAOGTCGCrGGAGGOCGGCAGTGATGTG 4606 
Asn Gly Ser Val Ser Phe Ser Thr Ser Leu Glu Ala Gly Ser Asp Val 
1520 1525 1530 1535 

OGC TAT TOC TGG GTG CTC TGT GAC OGC TGC AOG OOC ATC OCT GGG GGT 4654 
Arg Tyr Ser Trp Val Leu Cys Asp Arg Cys Thr Pro He Pro Gly Gly 

1540 1545 1550 

OCT AOC ATC TCT TAC AOC TTC OGC TOO GTG GGC AOC TTC AAT ATC ATC 4702 
Pro Thr Ile Ser Tyr Thr Phe Arg Ser Val Gly Thr Phe Asn Ile Ile
1555 1560 1565 

GIC AOG GCT GAG AAC GAG GTG GGC TOC GOC CAG GAC AGC ATC TTC GTC 4750 
Val Thr Ala Glu Asn Glu Val Gly Ser Ala Gin Asp Ser He Phe Val 
1570 1575 1580 

TAT GTC CTG CAG CTC ATA GAG GGG CTG CAG GIG GTG GGC GGT GGC OGC 4798 
Tyr Val Leu Gin Leu He Glu Gly Leu Gin Val Val Gly Gly Gly Arg 
1585 1590 1595 

TAC TTC OOC AOC AAC CAC AOG CTA CAG CTG CAG GOC GTG GIT AGG GAT 4846 
Tyr Phe Pro Thr Asn His Thr Val Gin Leu Gin Ala Val Val Arg Asp 
1600 1605 1610 1615 

GGC AOC AAC GTC TOC TAC AGC TGG ACT GGC TGG AGG GAC AGG GGC COG 4894 
Gly Thr Asn Val Ser Tyr Ser Trp Thr Ala Trp Arg Asp Arg Gly Pro 

1620 1625 1630 

GCC CTG GOC GGC AGC GGC AAA GGC TIC TOG CTC AOC GTG CIC GAG GOC 4942 
Ala Leu Ala Gly Ser Gly Lys Gly Phe Ser Leu Thr Val Leu Glu Ala
1635 1640 1645 

GGC AOC TAC CAT GTG CAG CTG OGG GOC AOC AAC ATG CTG GGC AGC GOC 4990 
Gly Thr Tyr His Val Gln Leu Arg Ala Thr Asn Met Leu Gly Ser Ala
1650 1655 1660 

TGG GCC GAC TGC AOC ATG GAC TIC GTG GAG OCT GTG GGG TGG CTG ATG 5038 
Trp Ala Asp Cys Thr Met Asp Phe Val Glu Pro Val Gly Trp Leu Met
1665 1670 1675

CTIGACCGOCTOCOOGAACOCAGCTGCCGriCAACACAAD^ GTC AOC CTC 5086
Val Thr Ala Ser Pro Asn Pro Ala Ala Val Asn Thr Ser Val Thr Leu
1680 1685 1690 1695

AGT GOC GAG CTG OCT QC?r GQC AOT GGT GTC GTk TAC ACT TGG TOC TIG 5134 
Ser Ala Glu Leu Ala Gly Gly Ser Gly Val Val Tyr Thr Trp Ser Leu
1700 1705 1710

GAG GAG GGG CTG AGC TGG GAG PCC TOC GAG CCA TIT ACC AOC CAT AGC 5182 
Glu Glu Gly Leu Ser Trp Glu Thr Ser Glu Pro Phe Thr Thr His Ser
1715 1720 1725 

TlCOOCACACOCGQCCrGCfiCTTGGIC ACC ATG AOG GCA GOG AAC OCG 5230 
Phe Pro Thr Pro Gly Leu His Leu Val Thr Met Thr Ala Gly Asn Pro 
1730 1735 1740 

CrGGGCTCAGOCAACGCCACCGTGGAAGrGGATGrGCAGGTGOCr GTG 5278
Leu Gly Ser Ala Asn Ala Thr Val Glu Val Asp Val Gln Val Pro Val
1745 1750 1755 

AC?r GGC CIC AGC ATC AOG GOC AGC GAG CCC GGA GGC AGC TTC GTG GCG 5326 
Ser Gly Leu Ser He Arg Ala Ser Glu Pro Gly Gly Ser Phe Val Ala 
1760 1765 1770 1775 

GOC GGG TCC TCP GTG 000 TTT TGG GGG GAG CTG GOC AOG GGC AOC AAT 5374 
Ala Gly Ser Ser Val Pro Phe Trp Gly Gln Leu Ala Thr Gly Thr Asn
1780 1785 1790

GTG AGC TGG TGC TGG GOT GIG OOC GOC GOC AOC AGC AAG COT GGC OCT 5422 
Val Ser Trp Cys Trp Ala Val Pro Gly Gly Ser Ser Lys Arg Gly Pro 
1795 1800 1805 

CAT GTC AOC ATG GTC TIC COG GAT OCT GGC AOC TTC TOC ATC CG3 CTC 5470 
His Val Thr Met Val Phe Pro Asp Ala Gly Thr Phe Ser Ile Arg Leu
1810 1815 1820 

AAT GCC TOC AAC GCA GTC AGC TGG GTC TCA GOC AOG TAC AAC CTC AOG 5518 
Asn Ala Ser Asn Ala Val Ser Trp Val Ser Ala Thr Tyr Asn Leu Thr 
1825 1830 1835 

GOG GAG GAG OOC ATC GTG GGC CTG GIG CTG TOG GOC AGC AGC AAG GTG 5566 
Ala Glu Glu Pro He Val Gly Leu Val Leu Trp Ala Ser Ser Lys Val 
1840 1845 1850 1855 

GTG GCG OOC GGG CAG CTG GTC CAT TTT CAG ATC CTG CTG GOT GOC GGC 5614 
Val Ala Pro Gly Gln Leu Val His Phe Gln Ile Leu Leu Ala Ala Gly
1860 1865 1870

TCA GOT GTC AOC TTC CGC CTG CAG GTC GGC GGG GOC AAC OOC GAG GTG 5662 
Ser Ala Val Thr Phe Arg Leu Gin Val Gly Gly Ala Asn Pro Glu Val 
1875 1880 1885 

CTC OOC GGG OOC OCT TTC TOC CAC AGC TTC OOC 000 GTC GGA GAC CAC 5710 
Leu Pro Gly Pro Arg Phe Ser His Ser Phe Pro Arg Val Gly Asp His
1890 1895 1900 

GTG GTG AGC CTG CGG GGC AAA AAC CAC GTG AGC TGG GCC CAG GCG CAG 5758
Val Val Ser Val Arg Gly Lys Asn His Val Ser Trp Ala Gln Ala Gln
1905 1910 1915

GTG CGC ATC GTG GTG CTG GAG GCC GTG AGT GGG CTG CAG ATG CCC AAC 5806
Val Arg Ile Val Val Leu Glu Ala Val Ser Gly Leu Gln Met Pro Asn
1920 1925 1930 1935 

TGC TGC GftG CTT GGC ATC GCC AOG 03Z ACT GftG AGG AAC TTC ACA GCC 5854 
Cys Cys Glu Pro Gly lie Ala Thr Gly Thr Glu Arg Asn Phe Thr Ala 

1940 1945 1950 

CGCGTGCAGCXSCGGCTCTOQGGrCGOCTACGOCTGGTACTTC ICC CIG 5902 
Arg Val Gin Arg Gly Ser Arg Val Ala Tyr Ala Trp Tyr Phe Ser Leu 
1955 1960 1965 

CAG AAG GTC CAG GGC GAC T0C3 CIG GTC ATC CIG TOG GGC CGC GAC GTC 5950 
Gin Lys Val Gin Gly Asp Ser Leu Val lie Leu Ser Gly Arg Asp Val 
1970 1975 1980 

ACC TAC ACG CCC GTG GCC GCG GGG CTG TTG GAG ATC CAG GTG CGC GCC 5998
Thr Tyr Thr Pro Val Ala Ala Gly Leu Leu Glu Ile Gln Val Arg Ala
1985 1990 1995 

TTC AAC GCC CIG GGC AGT GAG AAC OGC ACG CIG GIG CTG GAG GIT CAG 6046 
Phe Asn Ala Leu Gly Ser Glu Asn Arg Thr Leu Val Leu Glu Val Gln
2000 2005 2010 2015 

GAC GCC GTC CAG TAT GIG GCC CIG CAG AGO GOC CCC TGC TTC AOC AAC 6094 
Asp Ala Val Gin Tyr Val Ala Leu Gin Ser Gly Pro Cys Phe Thr Asn 

2020 2025 2030 

OGC TOG GOG CAG TIT GAG GOC GOC AOC AGC COC AGO OOC OGG OCT GIG 6142 
Arg Ser Ala Gin Phe Glu Ala Ala Thr Ser Pro Ser Pro Arg Arg Val 
2035 2040 2045 

GOC TAC CAC TGG GAC TIT GGG GAT GOG TOG OCA GGG CAG GAC ACA GAT 6190 
Ala Tyr His Trp Asp Phe Gly Asp Gly Ser Pro Gly Gin Asp Thr Asp 
2050 2055 2060 

GAG 000 AGG GOC GAG CAC TOC TAC CTG AGG OCT GGG GAC TAC CGC GTG 6238 
Glu Pro Arg Ala Glu His Ser Tyr Leu Arg Pro Gly Asp Tyr Arg Val 
2065 2070 2075 

CAG GIG AAC GOC TOC AAC CTG GIG AGC TIC TIC GIG GOG CAG GOC ACG 6286 
Gin Val Asn Ala Ser Asn Leu Val Ser Phe Phe Val Ala Gin Ala Thr 
2080 2085 2090 2095 

GTG AOC GIC CAG GTG CIG GOC TGC OGG GAG OOG GAG GTG GAC GIG GTC 6334 
Val Thr Val Gin Val Leu Ala Cys Arg Glu Pro Glu Val Asp Val Val 

2100 2105 2110 

CTG OOC CIG CAG GIG CIG ATG GGG OGA TCA CAG OGC AAC TAC TIG GAG 6382 
Leu Pro Leu Gln Val Leu Met Arg Arg Ser Gln Arg Asn Tyr Leu Glu
2115 2120 2125 

GOC CAC GIT GAC CTG OGC GAC TGC GTC AOC TAC CAG ACT GAG TAC CGC 6430 
Ala His Val Asp Leu Arg Asp Cys Val Thr Tyr Gin Thr Glu Tyr Arg 
2130 2135 2140 

TGG GAG GTG TAT CGC ACC GCC AGC TGC CAG CGG CCG GGG CGC CCA GCG 6478
Trp Glu Val Tyr Arg Thr Ala Ser Cys Gln Arg Pro Gly Arg Pro Ala
2145 2150 2155

CGT GTG GCC CTG CCC GGC GTG GAC GTG AGC CGG CCT CGG CTG GTG CTG 6526
Arg Val Ala Leu Pro Gly Val Asp Val Ser Arg Pro Arg Leu Val Leu
2160 2165 2170 2175 

COG OGG CIG GCG CIG OCT GTG GGG CAC TAG TGC TTT GTG TIT GTC C?IG 6574 
Pro Arg Leu Ala Leu Pro Val Gly His Tyr Cys Phe Val Phe Val Val 

2180 2185 2190 

TCA TIT GGG GAC AOG OCA CIG ACA CAG AGO ATC CAG GOC AAT GIG AOG 6622 
Ser Phe Gly Asp Thr Pro Leu Thr Gln Ser Ile Gln Ala Asn Val Thr
2195 2200 2205 

C?IG GOC OOC GAG OGC CIG GIG OOC ATC ATT GAG GGT GGC TCA TAC OGC 6670 
Val Ala Pro Glu Arg Leu Val Pro lie He Glu Gly Gly Ser Tyr Arg 
2210 2215 2220 

C?IG TOG TCA GAC ACA OGG GAC CIG GTG CIG GAT GGG AGC GAG TOO TAC 6718 
Val Trp Ser Asp Thr Arg Asp Leu Val Leu Asp Gly Ser Glu Ser Tyr 
2225 2230 2235 

GAC CCC AAC CTG GAG GAC GGC GAC CAG ACG COG CIC AC?r TTC CAC TGG 6766 
Asp Pro Asn Leu Glu Asp Gly Asp Gin Thr Pro Leu Ser Phe His Trp 
2240 2245 2250 2255 

GOC TGT GIG GCT TOG ACA CAG AGS GAG GOT GGC GGG TGT GCG CIG AAC 6814 
Ala Cys Val Ala Ser Thr Gln Arg Glu Ala Gly Gly Cys Ala Leu Asn
2260 2265 2270

TIT GGG COC OGC GGG AGO AGC AOG GIC AOC ATT CCA CGG GAG OGG CIG 6862 
Phe Gly Pro Arg Gly Ser Ser Thr Val Thr Ile Pro Arg Glu Arg Leu
2275 2280 2285 

GOG GCr GGC GIG GAG TAC AOC TTC AGC CTG ADC GIG TGG AAG GOC GGC 6910 
Ala Ala Gly Val Glu Tyr Thr Phe Ser Leu Thr Val Trp Lys Ala Gly
2290 2295 2300 

OGC AAG GAG GAG GOC AGC AAC CAG AOG GTG CIG ATC OGG AGT GGC OGG 6958 
Arg Lys Glu Glu Ala Thr Asn Gln Thr Val Leu Ile Arg Ser Gly Arg
2305 2310 2315 

GIG OOC ATT GIG TOO TIG GAG TGT GIG TOO TGC AAG GCA CAG GGC GIG 7006 
Val Pro He Val Ser Leu Glu Cys Val Ser Cys Lys Ala Gin Ala Val 
2320 2325 2330 2335 

TAC GAA GIG AGC OGC AGC TOC TAC GIG TAC TIG GAG GGC OGC TOO CIC 7054 
Tyr Glu Val Ser Arg Ser Ser Tyr Val Tyr Leu Glu Gly Arg Cys Leu
2340 2345 2350

AAT TGC AGC AGC GGC TOC AAG OGA GGG OGG TGG GCT GCA OGT AOG TIC 7102 
Asn Cys Ser Ser Gly Ser Lys Arg Gly Arg Trp Ala Ala Arg Thr Phe
2355 2360 2365 

AGC AAC AAG AOG CTG GIG CIG GAT GAG AOC AOC ACA TOO AOG GGC AGT 7150 
Ser Asn Lys Thr Leu Val Leu Asp Glu Thr Thr Thr Ser Thr Gly Ser
2370 2375 2380 

GCA GGC ATG OGA CIG GIG CIG OGG OGG GGC GIG CTG OGG GAC GGC GAG 7198 
Ala Gly Met Arg Leu Val Leu Arg Arg Gly Val Leu Arg Asp Gly Glu 
2385 2390 2395

GGA TAC AOC TTC AOG CTC PCG GTG CTG GGC CX3C TCP GGC GAG GAG GAG 7246
Gly Tyr Thr Phe Thr Leu Thr Val Leu Gly Arg Ser Gly Glu Glu Glu
2400 2405 2410 2415 

GGC TGC GOC TOC ATC COC CTG TCC CX3C AAC OGC COG COG CTG GGG GGC 7294 
Gly Cys Ala Ser lie Arg Leu Ser Pro Asn Arg Pro Pro Leu Gly Gly 

2420 2425 2430 

OCT TGC OQC CIC TIC CCA CTG GGC GCT CTTG CAC GOC CTC ADC ADC AAG 7342 
Ser Cys Arg Leu Phe Pro Leu Gly Ala Val His Ala Leu Thr Thr Lys
2435 2440 2445 

GrGCACTICGAAT(X:Aa3GQCTGGCATGtfCG0GGAGGATGCTGGC GOC 7390 
Val His Phe Glu Cys Thr Gly Trp His Asp Ala Glu Asp Ala Gly Ala
2450 2455 2460 

COG CTG GTG TAG GOC CTG CIG CTG OGG OGC TGT OGC CAG GGC CAC TGC 7438 
Pro Leu Val Tyr Ala Leu Leu Leu Arg Arg Cys Arg Gin Gly His Cys 
2465 2470 2475

GAG GAG TTC TGT GTC TAG AAG GGC AGC CTC TCC AGO TAC GGA GOC GTG 7486 
Glu Glu Phe Cys Val Tyr Lys Gly Ser Leu Ser Ser Tyr Gly Ala Val
2480 2485 2490 2495 

CTG COC COG GC?r TIC AGG CCA CAC TTC GAG GTG GGC CTG GOC GTG GIG 7534 
Leu Pro Pro Gly Phe Arg Pro His Phe Glu Val Gly Leu Ala Val Val
2500 2505 2510

GTG CAG GAC CAG CTG GGA GCC OCT GIG CTC GOC CTC AAC AGG TCT TTG 7582 
Val Gln Asp Gln Leu Gly Ala Ala Val Val Ala Leu Asn Arg Ser Leu
2515 2520 2525 

GOC ATC AOC CIC CCA GAG OGC AAC GGC AGC GCA ACG GGG CIC ACA GTC 7630 
Ala lie Thr Leu Pro Glu Pro Asn Gly Ser Ala Thr Gly Leu Thr Val 
2530 2535 2540 

TGG CTG CAC GGG CIC ADC GOT AGP GIG CTC CCA GGG CTG CTG OGG CAG 7678 
Trp Leu His Gly Leu Thr Ala Ser Val Leu Pro Gly Leu Leu Arg Gin 
2545 2550 2555 

GOC GAT 000 CAG CAC GIC ATC GAG TAC TOG TTG GOC CIG GIC ADC GTG 7726 
Ala Asp Pro Gln His Val Ile Glu Tyr Ser Leu Ala Leu Val Thr Val
2560 2565 2570 2575 

CTG AAC GAG TAC GAG OGG GOC CIG GAC GIG GOG GCA GAG COC AAG CAC 7774 
Leu Asn Glu Tyr Glu Arg Ala Leu Asp Val Ala Ala Glu Pro Lys His
2580 2585 2590

GAG OGG CAG CAC OGA GOC CAG ATA OGG AAG AAC ATC AOG GAG ACT CTG 7822 
Glu Arg Gin His Arg Ala Gin lie Arg Lys Asn lie Thr Glu Thr Leu 
2595 2600 2605 

GIG TCC CTG AGG GTC CAC ACT GTG GAT GAC ATC CAG CAG ATC GOT GCT 7870 
Val Ser Leu Arg Val His Thr Val Asp Asp lie Gin Gin lie Ala Ala 
2610 2615 2620 

GOG CTG GOC CAG TGC ATG GGG CDC AGC AGG GAG CTC GTA TGC OGC TOG 7918 
Ala Leu Ala Gin Cys Met Gly Pro Ser Arg Glu Leu Val Cys Arg Ser 
2625 2630 2635

TQC CTG AAG CAG AOS CTG CAC AAG CTG GAG GOC ATG ATG CIC ATC CTG 7966
Cys Leu Lys Gln Thr Leu His Lys Leu Glu Ala Met Met Leu Ile Leu
2640 2645 2650 2655 

CAG GCA GAG ACC AOC GOG GGC AOC GIG AOG COC ACT GOC ATC GGA GAC 8014 
Gln Ala Glu Thr Thr Ala Gly Thr Val Thr Pro Thr Ala Ile Gly Asp
2660 2665 2670

AGC ATC CIC AAC ATC ACA GC5A GAC CTC ATC CAC CTG GCC AGC TOG GAC 8062 
Ser He Leu Asn He Thr Gly Asp Leu He His Leu Ala Ser Ser Asp 
2675 2680 2685 

GIG OQG GCA OCA CAG COC TCA GAG CTG GGA GOC GAG TCA OCA TCT OGG 8110 
Val Arg Ala Pro Gin Pro Ser Glu Leu Gly Ala Glu Ser Pro Ser Arg 
2690 2695 2700

ATG GIG OOG TOC CAG GOC TAC AAC CTG AOC TCT GOC CTC ATG CGC ATC 8158 
Met Val Ala Ser Gin Ala Tyr Asn Leu Thr Ser Ala Leu Met Arg He 
2705 2710 2715 

CTC ATG CGC TOC CGC GIG CTC AAC GAG GAG COC CTG AOG CTG GOG GGC 8206 
Leu Met Arg Ser Arg Val Leu Asn Glu Glu Pro Leu Thr Leu Ala Gly 
2720 2725 2730 2735 

GAG GAG ATC GTG GOC CAG GGC AAG CGC TOG GAC OOG OGG AGC CTG CTG 8254 
Glu Glu He Val Ala Gin Gly Lys Arg Ser Asp Pro Arg Ser Leu Leu 

2740 2745 2750 

TGC TAT GGC GGC GOC OCA GGG OCT GGC TGC CAC TIC TOC ATC OOC GAG 8302 
Cys Tyr Gly Gly Ala Pro Gly Pro Gly Cys His Phe Ser Ile Pro Glu
2755 2760 2765 

car TTC AGC GGG GCC CTG GOC AAC CTC AGT GAC GTG GTG CAG CTC ATC 8350 
Ala Phe Ser Gly Ala Leu Ala Asn Leu Ser Asp Val Val Gin Leu He 
2770 2775 2780 

TIT CTG GTG GAC TOC AAT OOC TIT COC TTT GGC TAT ATC AGC AAC TAC 8398 
Phe Leu Val Asp Ser Asn Pro Phe Pro Phe Gly Tyr Ile Ser Asn Tyr
2785 2790 2795 

AOC GTC TOC AOC AAG GTG OOC TOG ATG GCA TTC CAG ACA CAG GOC GGC 8446 
Thr Val Ser Thr Lys Val Ala Ser Met Ala Phe Gln Thr Gln Ala Gly
2800 2805 2810 2815 

GOC CAG ATC OOC ATC GAG OGG CTG GOC TCA GAG CGC GOC ATC AOC GIG 8494 
Ala Gin He Pro He Glu Arg Leu Ala Ser Glu Arg Ala He Thr Val 

2820 2825 2830 

AAG GTG OOC AAC AAC TOG GAC TOG GCT GOC OGG GGC CAC CGC AGC TOC 8542 
Lys Val Pro Asn Asn Ser Asp Trp Ala Ala Arg Gly His Arg Ser Ser 
2835 2840 2845 

GOC AAC TOC GOC AAC TOC GTIT GTG GTC CAG COC CAG GOC TOC GTC GCT 8590 
Ala Asn Ser Ala Asn Ser Val Val Val Gin Pro Gin Ala Ser Val Gly 
2850 2855 2860 

GCT GTG GTC AOC CTG GAC AGC -AGO AAC OCT OOG GOC GGG GIG CAT CTG 8638 
Ala Val Val Thr Leu Asp Ser Ser Asn Pro Ala Ala Gly Leu His Leu
2865 2870 2875

CAG CTC AAC TAT ACG CTG CTG GAC GGC CAC TAC CTG TCT GAG GAA CCT 8686
Gln Leu Asn Tyr Thr Leu Leu Asp Gly His Tyr Leu Ser Glu Glu Pro
2880 2885 2890 2895 

GAG CCC TAG CTG GCA GTC TAG CTA CAC TOG GAG 000 OGG 000 AAT GAG 8734 
Glu Pro Tyr Leu Ala Val Tyr Leu His Ser Glu Pro Arg Pro Asn Glu
2900 2905 2910

CAC AAC TGC TOG GCT AGO AGG AGG ATC OGC OCA GAG TCA CIC CAG GGT 8782 
His Asn Cys Ser Ala Ser Arg Arg He Arg Pro Glu Ser Leu Gin Gly 
2915 2920 2925 

GCT GAC CAC OGG OOC TAC AOC TIC TIC ATT TOO COG GGG AGC AGA GAC 8830 
Ala Asp His Arg Pro Tyr Thr Phe Phe Ile Ser Pro Gly Ser Arg Asp
2930 2935 2940 

OCA GOG GGG ACT TAC CAT CTG AAC CTC TOG AGC CAC TIC OGC TGG TOG 8878 
Pro Ala Gly Ser Tyr His Leu Asn Leu Ser Ser His Phe Arg Trp Ser 
2945 2950 2955 

GOG CTG CAG GTG TOC GIG GC3C CTG TAC ACG TOC CTG TGC CAG TAC TTC 8926 
Ala Leu Gln Val Ser Val Gly Leu Tyr Thr Ser Leu Cys Gln Tyr Phe
2960 2965 2970 2975 

AGC GAG GAG GAC ATG GTG TGG OGG ACA GAG GGG CTG CTG 000 CTG GAG 8974 
Ser Glu Glu Asp Met Val Trp Arg Thr Glu Gly Leu Leu Pro Leu Glu 

2980 2985 2990 

GAG AOC TOG OOC OGC CAG OOC GTC TGC CTC ACC OGC CAC CIC AOC GOC 9022 
Glu Thr Ser Pro Arg Gin Ala Val Cys Leu Thr Arg His Leu Thr Ala 
2995 3000 3005 

TTC GGC GOO AGO CTC TTC CTG OOC OCA AOC CAT GTC OOC TIT CTG TTT 9070 
Phe Gly Ala Ser Leu Phe Val Pro Pro Ser His Val Arg Phe Val Phe 
3010 3015 3020 

OCT GAG COG ACA GOG GAT CTA AAC TAC ATC GTC ATG CTG ACA TCT GCT 9118 
Pro Glu Pro Thr Ala Asp Val Asn Tyr He Val Met Leu Thr Cys Ala 
3025 3030 3035 

GTG TGC CTG GTG ACC TAC ATG GTC ATG GCC GCC ATC CTG CAC AAG CTG 9166
Val Cys Leu Val Thr Tyr Met Val Met Ala Ala Ile Leu His Lys Leu
3040 3045 3050 3055 

GACCAGTIGGATQOCAGCOGGGGCCGCGOCATCOCTTrCTCT GGG CAG 9214 
Asp Gin Leu Asp Ala Ser Arg Gly Arg Ala He Pro Phe Cys Gly Gin 

3060 3065 3070 

OGG GOC OGC TTC AAG TAC GAG ATC CTC GTC AAG ACA GGC TGG GGC OGG 9262 
Arg Gly Arg Phe Lys Tyr Glu Ile Leu Val Lys Thr Gly Trp Gly Arg
3075 3080 3085 

GGC TCA GGT AOC AOG GOC CAC GTG GGC ATC ATG CTG TAT GGG GIG GAC 9310 
Gly Ser Gly Thr Thr Ala His Val Gly Ile Met Leu Tyr Gly Val Asp
3090 3095 3100 

AGCCGGAGCGGCCACOOGCACCTGGACGGCGACAGAGOCTTC CAC OGC 9358 
Ser Arg Ser Gly His Arg His Leu Asp Gly Asp Arg Ala Phe His Arg 
3105 3110 3115

AAC AGC CTG GAC ATC TTC CGG ATC GCC ACC CCG CAC AGC CTG GGT AGC 9406
Asn Ser Leu Asp Ile Phe Arg Ile Ala Thr Pro His Ser Leu Gly Ser
3120 3125 3130 3135 

GIG TGG AAG ATC OGA GIG TGG CAC GAC AAC AAA GSG CTC AGC OCT GCC 9454 
Val Trp Lys Ile Arg Val Trp His Asp Asn Lys Gly Leu Ser Pro Ala
3140 3145 3150

TGG TTC CTG CftG CAC GTC ATC C?IC AGG GAC CTG CAG AOG GCA GGC AGC 9502 
Trp Phe Leu Gln His Val Ile Val Arg Asp Leu Gln Thr Ala Arg Ser
3155 3160 3165

GCC TIC TTC CTG GIC AAT GftC TGG CTT TOG GTG GAG AOG GAG GCC AAC 9550 
Ala Phe Phe Leu Val Asn Asp Trp Leu Ser Val Glu Thr Glu Ala Asn 
3170 3175 3180 

GGG GGC CTG GTG GAG AAG GAG GTTG CTG GCC GOG AGC GAC GCA GCC CIT 9598 
Gly Gly Leu Val Glu Lys Glu Val Leu Ala Ala Ser Asp Ala Ala Leu 
3185 3190 3195 

TTG OGC TTC OGG CGC CTG CTG CTG GCT GAG CTG CAG OCT GGC TIC TIT 9646 
Leu Arg Phe Arg Arg Leu Leu Val Ala Glu Leu Gin Arg Gly Phe Phe 
3200 3205 3210 3215 

GAC AAG CAC ATC TGG CTC TOO ATA TGG GAC OGG OOG OCT OCT AGC OCT 9694 
Asp Lys His He Trp Leu Ser He Trp Asp Arg Pro Pro Arg Ser Arg 

3220 3225 3230 

TTC ACT OGC ATC CAG AGG GOO AOC TGC TGC GTT CTC CIC ATC TOO CTC 9742 
Phe Thr Arg He Gin Arg Ala Thr Cys Cys Val Leu Leu He Cys Leu 
3235 3240 3245 

TTC CTG GGC GOC AAC GOO GIG TGG TAC GGG GCT CTT GGC GAC TCT GOC 9790 
Phe Leu Gly Ala Asn Ala Val Trp Tyr Gly Ala Val Gly Asp Ser Ala 
3250 3255 3260 

TAC AGC AOG GGG CAT GTG TOO AGG CTG AGC OOG CTG AOC GTC GAC ACA 9838 
Tyr Ser Thr Gly His Val Ser Arg Leu Ser Pro Leu Ser Val Asp Thr 
3265 3270 3275 

GTC GCT GIT GGC CTG GTG TOO AGC GIG GTT GTC TAT COO GIC TAjC CTG 9886 
Val Ala Val Gly Leu Val Ser Ser Val Val Val Tyr Pro Val Tyr Leu 
3280 3285 3290 3295 

GOC ATC CTT TIT CTC TTC OGG ATG TOO OGG AGC AAG GIG GCT GGG AGC 9934 
Ala Ile Leu Phe Leu Phe Arg Met Ser Arg Ser Lys Val Ala Gly Ser
3300 3305 3310

OOG AGC OOC ACA OCT GOC GGG CAG CAG GTG CTG GAC ATC GAC AGC TGC 9982 
Pro Ser Pro Thr Pro Ala Gly Gin Gin Val Leu Asp He Asp Ser Cys 
3315 3320 3325 



CTG GAC TOG TOO GIG CTG GAC AGC TOO TIC CTC AOG TTC TOA GGC CTC 10030
Leu Asp Ser Ser Val Leu Asp Ser Ser Phe Leu Thr Phe Ser Gly Leu
3330 3335 3340

CAC GCT GAG GOC TTT GTT GGA CAG ATG AAG ACT GAC TIG TIT CTG GAT 10078
His Ala Glu Ala Phe Val Gly Gln Met Lys Ser Asp Leu Phe Leu Asp
3345 3350 3355

GAT TCT AAG AGT CIG GIG TQC TGG CXT TOC GGC GAG GGA AOG CTC AGT 10126
Asp Ser Lys Ser Leu Val Cys Trp Pro Ser Gly Glu Gly Thr Leu Ser
3360 3365 3370 3375 

TGG CCG GAC CTG CTC AGT GftC CXX3 TOC ATT GTG GGT AGC AAT CTG CX3G 10174 
Trp Pro Asp Leu Leu Ser Asp Pro Ser He Val Gly Ser Asn Leu Arg 

3380 3385 3390 

GAG CTG GCA OGG GGC GAG GOG GGC CAT G3G CTG GGC OCA GAG GAG GAC 10222 
Gln Leu Ala Arg Gly Gln Ala Gly His Gly Leu Gly Pro Glu Glu Asp
3395 3400 3405 

GGC TTC TOC CIG GOC AGC OOC TAC TOG OCT GCC AAA TOO TTC TCA GCA 10270 
Gly Phe Ser Leu Ala Ser Pro Tyr Ser Pro Ala Lys Ser Phe Ser Ala
3410 3415 3420 

TCA GAT GAA GAC CTG ATC CAG CAG GTC dT GOC GAG GGG GTC AGC AGC 10318 
Ser Asp Glu Asp Leu He Gin Gin Val Leu Ala Glu Gly Val Ser Ser 
3425 3430 3435 

OCA GOC OCT ACC CAA GAC ADC CAC ATG GAA ACG GAC CTG CTC AGC AGC 10366 
Pro Ala Pro Thr Gin Asp Thr His Met Glu Thr Asp Leu Leu Ser Ser 
3440 3445 3450 3455 

CTG TOC AGC ACT OCT GGG GAG AAG ACA GAG AOG CTG GOG CTG CAG AGG 10414 
Leu Ser Ser Thr Pro Gly Glu Lys Thr Glu Thr Leu Ala Leu Gin Arg 

3460 3465 3470 

CTG GGG GAG CTG GGG OCA OOC AGC OCA GGC CTG AAC TGG GAA CAG OOC 10462 
Leu Gly Glu Leu Gly Pro Pro Ser Pro Gly Leu Asn Trp Glu Gin Pro 
3475 3480 3485

CAG GCA GOG AOG CTG TCC AGG ACA GGA CTG GIG GAG GGT CTG OGG AAG 10510 
Gin Ala Ala Arg Leu Ser Arg Thr Gly Leu Val Glu Gly Leu Arg Lys 
3490 3495 3500 



CGC CTG CTG CCG GCC TGG TGT GCC TCC CTG GCC CAC GGG CTC AGC CTG 10558
Arg Leu Leu Pro Ala Trp Cys Ala Ser Leu Ala His Gly Leu Ser Leu
3505 3510 3515



CTC CTG GTG OCT GTG GCT GTG GCT GTC TCA GGG TGG GTG GGT GOG AGC 10606 
Leu Leu Val Ala Val Ala Val Ala Val Ser Gly Trp Val Gly Ala Ser 
3520 3525 3530 3535 

TTC OOC OOG GGC GTG AGT GIT GOG TGG CTC CIG TOC AGC AGC GOC AGC 10654 
Phe Pro Pro Gly Val Ser Val Ala Trp Leu Leu Ser Ser Ser Ala Ser 

3540 3545 3550 

TTC CTG GOC TCA TIC CTC GGC TGG GAG OCA CTG AAG GTC TIG CTG GAA 10702 
Phe Leu Ala Ser Phe Leu Gly Trp Glu Pro Leu Lys Val Leu Leu Glu 
3555 3560 3565 

GOC CTG TAC TTC TCA CTG GIG OOC AAG OGG CTG CAC OOG GAT GAA GAT 10750 
Ala Leu Tyr Phe Ser Leu Val Ala Lys Arg Leu His Pro Asp Glu Asp
3570 3575 3580 

GAC AOC CTG GTA GAG AGC COG GCT GTG AOG OCT GTG AGC GCA CGT GIG 10798 
Asp Thr Leu Val Glu Ser Pro Ala Val Thr Pro Val Ser Ala Arg Val 
3585 3590 3595

CXX: CCC GTA OGG OCA CXX: CAC GGC TTT GCA CIC TTC CTG OOC AAG GAA 10846
Pro Arg Val Arg Pro Pro His Gly Phe Ala Leu Phe Leu Ala Lys Glu
3600 3605 3610 3615 

GAA GOC OGC AAG C?IC AAG AGG CTA CAT GGC ATG CTG CGG AGC CTC CTG 10894 
Glu Ala Arg Lys Val Lys Arg Leu His Gly Met Leu Arg Ser Leu Leu
3620 3625 3630

GTG TAC ATG CTT TTT CTG CTG GIG AOC CTG CTG GCC AGC TAT GOG GAT 10942 
Val Tyr Met Leu Phe Leu Leu Val Thr Leu Leu Ala Ser Tyr Gly Asp
3635 3640 3645 

GCC TCA TGC CAT GGG CAC GCC TAC CDT CTG CAA AGC GDC ATC AAG CAG 10990 
Ala Ser Cys His Gly His Ala Tyr Arg Leu Gln Ser Ala Ile Lys Gln
3650 3655 3660 

GAG CTG CAC AGC COG GCC TTC CTG GCC ATC ACG OGG TCT GAG GAG CTC 11038 
Glu Leu His Ser Arg Ala Phe Leu Ala Ile Thr Arg Ser Glu Glu Leu
3665 3670 3675 

TGG CCA TGG ATG GCC CAC GIG CTG CIG OOC TAC GTC CAC GGG AAC CAG 11086 
Trp Pro Trp Met Ala His Val Leu Leu Pro Tyr Val His Gly Asn Gln
3680 3685 3690 3695 

TCC AGC OCA GAG CTG GGG OOC OCA OGG CTG OGG CAG GIG OGG CTG CAG 11134 
Ser Ser Pro Glu Leu Gly Pro Pro Arg Leu Arg Gln Val Arg Leu Gln
3700 3705 3710

GAA GCA CTC TAC OCA GAC OCT OOC GGC OOC AGG GTC CAC AOG TGC TOG 11182 
Glu Ala Leu Tyr Pro Asp Pro Pro Gly Pro Arg Val His Thr Cys Ser 
3715 3720 3725 

GCC GCA GGA GGC TTC AOC ACC AGC GAT TAC GAC GIT GGC TGG GAG POT 11230 
Ala Ala Gly Gly Phe Ser Thr Ser Asp Tyr Asp Val Gly Trp Glu Ser
3730 3735 3740 

CCr CAC AAT GGC TOG GG3 AOG TGG GOC TAT TCA GOG COG GAT CVG CTG 11278 
Pro His Asn Gly Ser Gly Thr Trp Ala Tyr Ser Ala Pro Asp Leu Leu
3745 3750 3755 



GGG GCA TGG TCC TGG GGC TCC TGT GCC GTTG TAT GAC AGC GGG GGC TAC 11326 
Gly Ala Trp Ser Trp Gly Ser Cys Ala Val Tyr Asp Ser Gly Gly Tyr 
3760 3765 3770 3775 



GTG CAG GAG CTG GGC CTG AGC CTG GAG GAG AGC OGC GAC OGG CTG OGC 11374
Val Gln Glu Leu Gly Leu Ser Leu Glu Glu Ser Arg Asp Arg Leu Arg
3780 3785 3790

TIC CTG CAG CTG CAC AAC TGG CIG GAC AAC AGG AGC OGC GCT GTTG TTC 11422
Phe Leu Gln Leu His Asn Trp Leu Asp Asn Arg Ser Arg Ala Val Phe
3795 3800 3805

CTG GAG CTC AOG OGC TAC AGC COG GOC GIG GGG CTG CAC GCC GCC GTC 11470
Leu Glu Leu Thr Arg Tyr Ser Pro Ala Val Gly Leu His Ala Ala Val
3810 3815 3820







AOG CTG OGC CTC GAG TTC COG GOG GOC GGC CGC GCC CTG GOC GCC CTC 11518
Thr Leu Arg Leu Glu Phe Pro Ala Ala Gly Arg Ala Leu Ala Ala Leu 
3825 3830 3835

AGC C?rc OGC OOC TTT GOG CTG CGC CCC CTC AGC 003 GGC CIC TOG CTG 11566
Ser Val Arg Pro Phe Ala Leu Arg Arg Leu Ser Ala Gly Leu Ser Leu
3840 3845 3850 3855

OCT CTG CIC AOC TCG GTS TGC CTG CTG CTG TIC GOC GIG CAC TIC GOC 11614 
Pro Leu Leu Thr Ser Val Cys Leu Leu Leu Phe Ala Val His Phe Ala
3860 3865 3870

GTG GOC GAG GOC CGT ACT TGG CAC AGG GAA GGG OGC TGG OGC C?IG CTG 11662 
Val Ala Glu Ala Arg Thr Trp His Arg Glu Gly Arg Trp Arg Val Leu
3875 3880 3885 

CGG CTC GGA GOC TGG GOG OGG TGG CTG CTG GTG GOG CTG AOG GOG GOC 11710 
Arg Leu Gly Ala Trp Ala Arg Trp Leu Leu Val Ala Leu Thr Ala Ala 
3890 3895 3900 

AOG GCA CTG GTA OGC CTC GOC CAG CTG GGT GOC GCT GAC OGC CAG TGG 11758 
Thr Ala Leu Val Arg Leu Ala Gln Leu Gly Ala Ala Asp Arg Gln Trp
3905 3910 3915 

AOC OGT TIC GTG CGC GGC OGC COG OGC OGC TIC ACT AGC TIC GAC CAG 11806 
Thr Arg Phe Val Arg Gly Arg Pro Arg Arg Phe Thr Ser Phe Asp Gln
3920 3925 3930 3935 

GIG GOG CAC GTG AGC TOO GCA GOC OGT GGC CIG GOG GOO TOG CTG CIC 11854 
Val Ala His Val Ser Ser Ala Ala Arg Gly Leu Ala Ala Ser Leu Leu 

3940 3945 3950 

TIC CIG CTT TIG QIC AAG GCT GOC CAG CAC GTA OGC TIC GIG OGC CAG 11902 
Phe Leu Leu Leu Val Lys Ala Ala Gln His Val Arg Phe Val Arg Gln
3955 3960 3965 

TGG TOC GIC TTT GGC AAG ACA TTA TGC OGA GCT CIG OCA GAG CTC CIG 11950 
Trp Ser Val Phe Gly Lys Thr Leu Cys Arg Ala Leu Pro Glu Leu Leu 
3970 3975 3980 

GOG GIC AOC TIG GGC CIG GIG GIG CIC GGG GTA GOC TAC GOC CAG CIG 11998 
Gly Val Thr Leu Gly Leu Val Val Leu Gly Val Ala Tyr Ala Gln Leu
3985 3990 3995 

GOC ATC CIG CIC GIG TCT TOC TGT GIG GAC TOO CIC TGG AGC GIG GOC 12046 
Ala Ile Leu Leu Val Ser Ser Cys Val Asp Ser Leu Trp Ser Val Ala
4000 4005 4010 4015 



CAG GOC CIG TIG GTG CTG TGC OCT GGG ACT-*???
Gln Ala Leu Leu Val Leu Cys Pro Gly Thr Gly Leu Ser Thr Leu Cys
4020 4025 4030

OCT GOC GAG TOC TGG CAC CIG TCA 000 CIG CIG TGT GIG GGG CTC TGG 12142
Pro Ala Glu Ser Trp His Leu Ser Pro Leu Leu Cys Val Gly Leu Trp
4035 4040 4045

GCA CIG OGG CIG TGG GGC GOC CTA OGG CTG GGG GCT GIT ATT CIC OGC 12190
Ala Leu Arg Leu Trp Gly Ala Leu Arg Leu Gly Ala Val Ile Leu Arg
4050 4055 4060

TGG OGC TAC CAC GOC TIG OGT GGA GAG CTG TAC OGG COG GOC TGG GAG 12238
Trp Arg Tyr His Ala Leu Arg Gly Glu Leu Tyr Arg Pro Ala Trp Glu
4065 4070 4075

OOC CAG GAC TAG GAG ATG GTS GAG TIG TIC CIG OGC AQG CTG OGC CTC 12286
Pro Gln Asp Tyr Glu Met Val Glu Leu Phe Leu Arg Arg Leu Arg Leu
4080 4085 4090 4095 

TGG ATG OGC CTC A9C AAG GTC AAG GAG TIC OGC CAC AAA GTC OGC TIT 12334 
Trp Met Gly Leu Ser Lys Val Lys Glu Phe Arg His Lys Val Arg Phe
4100 4105 4110

GAA GGG ATG GAG COG CTG OCC TCT CGC TOC TCC AGG GQC TOC AAG GTA 12382 
Glu Gly Met Glu Pro Leu Pro Ser Arg Ser Ser Arg Gly Ser Lys Val 
4115 4120 4125 

TOC OCG GAT GTG OOC OCA OOC AGO GCT GGC TOC GAT GOC TCG CAC OOC 12430 
Ser Pro Asp Val Pro Pro Pro Ser Ala Gly Ser Asp Ala Ser His Pro 
4130 4135 4140 

TOC AOC TOC TOC AGC CAG CTG GAT GGG CTG AGC GIG AGO CIG GGC OGG 12478 
Ser Thr Ser Ser Ser Gln Leu Asp Gly Leu Ser Val Ser Leu Gly Arg
4145 4150 4155 

CIG GGG ACA AGG TGT GAG OCT GAG OOC TCC OGC CTC CAA GOC GTG TTC 12526 
Leu Gly Thr Arg Cys Glu Pro Glu Pro Ser Arg Leu Gln Ala Val Phe
4160 4165 4170 4175 

GAG GOC CIG CTC AOC CAG TIT GAC CGA CTC AAC CAG GCC ACA GAG GAC 12574 
Glu Ala Leu Leu Thr Gln Phe Asp Arg Leu Asn Gln Ala Thr Glu Asp
4180 4185 4190

GTC TAG CAG CTG GAG CAG CAG CIG CAC AGC CIG CAA GGC OGC AGG AGC 12622 
Val Tyr Gln Leu Glu Gln Gln Leu His Ser Leu Gln Gly Arg Arg Ser
4195 4200 4205 

AGC OGG GOG COC GOC GGA TCT TCC OGT GGC OCA TOC COG GGC CTG OGG 12670 
Ser Arg Ala Pro Ala Gly Ser Ser Arg Gly Pro Ser Pro Gly Leu Arg 
4210 4215 4220 

OCA GCA CTG OOC AGC OGC CTT GOC OGG GOC AGT OGG GCT GIG GAC CIG 12718 
Pro Ala Leu Pro Ser Arg Leu Ala Arg Ala Ser Arg Gly Val Asp Leu 
4225 4230 4235 

GOC ACT GGC COC AGC AGG ACA OCT TOG GGC CAA GAA CAA GCT CCA OCC 12766 
Ala Thr Gly Pro Ser Arg Thr Pro Ser Gly Gln Glu Gln Gly Pro Pro
4240 4245 4250 4255 

CAG CAG CAC TTA GTC CTC CTT CCT GGC GGG GCT GGG OOG TGG ACT OGG 12814 
Gln Gln His Leu Val Leu Leu Pro Gly Gly Gly Gly Pro Trp Ser Arg
4260 4265 4270

ACT GGA CAC OGC TCA CTA TEA CTT TCT GOC GCT GTC AAG GOC GAG GGC 12862 
Ser Gly His Arg Ser Val Leu Leu Ser Ala Ala Val Lys Ala Glu Gly 
4275 4280 4285 

CAGGCAGAATGGCTGCACCTAGCTTOCCCAGAGAGC AGG CAG GGG CAT 12910 
Gln Ala Glu Trp Leu His Val Gly Ser Pro Glu Ser Arg Gln Gly His
4290 4295 4300 

CTG TCT GTC TCT GGG CTT CAG -CAC TIT AAA GAG GCT GTG TGG OCA AOC 12958 
Leu Ser Val Cys Gly Leu Gln His Phe Lys Glu Ala Val Trp Pro Thr
4305 4310 4315

AGG ACC CAG GCT COC CTC CXT AGC TOC CTT GGG AAG GAC ACA GCA GTA 13006
Arg Thr Gln Gly Pro Leu Pro Ser Ser Leu Gly Lys Asp Thr Ala Val
4320 4325 4330 4335 

ITC GftC GGTT TTC TAGOCICTGA GATGCTAAIT TATTTOXOG ACJTOCTCAGG 13058 
Leu Asp Gly Phe 

TACAGC330GC TOrGOXGGC OOCAOOCrCT GGGCAGATCT OOCXrACTGC TAAGGCTGCT 13118 

GGCITCADQG AGGCJITAGCX: TGCAOOGOCG OCACOCTGOC CCTAAGrTTAT TACCTCIOCA 13178 

GTTOCTAOCX; TACTOCXTOC AOOCTCICAC TGTGTGnCTC GIGTCAGTAA TrTATATGCTr 13238 

GITAAAATGT GTATATmT GTATGnCACT ATmCACTA GGGCTGAOOG QCCTOCCCCC 13298 

AGAGCrGGCC TCCCCCAACA OCrGCrOC33C TIGGTAGCTrG TGCTGQOGTT ATGGCAGOOC 13358 

GGCTGCTGCr TGGATGCXSAG CrTOGOCTIG GGOOQC?rGCT GGGGGCACAG CTOTCrGCrA 13418 

GGCACTCTCA TCACOOCAGA GGOCTTGnCA TCCTOOCTTG COOCAGGOCA GCTTAGCAAGA 13478 

GAGCAGOGOC CAGQOCTGCT GGCATCftGGT CTGGGCAAGT AGCAGGACTA GGCATGTCAG 13538 

AGGACOOCAG GGrGGITAGA GGAAAAGACT CCTOCrGGGG GCItSGCTOOC AGGGTGGAGG 13598 

AAGGTGACTG TGICTGICTG TGrGTGOQaG OXISAOGOGC GAGICTGCTG TATGQCOCAG 13658 

GCAGOCTCAA GQOCCTOGGA GCTQGCTGTG OCTGCnCTG TCTACXIACrr CICTGOGCAT 13718 

GGOOGCTICT AGAGQCrCGA CACOCOOCCA AOOOOCGCAC CAAGCAGACA AAGTCAATAA 13778 

AAGAOCTGTC TGACIOCZAAA AAAAAAAAA 13807 
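
Each translated row of the listing above pairs 16 codons with their three-letter residues, so a scanned copy can be cross-checked mechanically. The following is a minimal sketch, assuming the standard genetic code; the names `translate_row` and `mismatches` and the `???` placeholder for unreadable codons are illustrative choices and are not part of the listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}
CODON_TABLE = {
    "".join(codon): THREE_LETTER[aa]
    for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}


def translate_row(row):
    """Translate a whitespace-separated codon row; unreadable codons give '???'."""
    return [CODON_TABLE.get(codon.upper(), "???") for codon in row.split()]


def mismatches(codon_row, residue_row):
    """Return (index, codon, translated, printed) wherever the two rows disagree."""
    codons = codon_row.split()
    printed = residue_row.split()
    translated = translate_row(codon_row)
    return [
        (i, codons[i], translated[i], printed[i])
        for i in range(min(len(codons), len(printed)))
        if translated[i] != printed[i]
    ]


# The first nine codons of the row ending at nucleotide 11326 scanned cleanly.
row = "GGG GCA TGG TCC TGG GGC TCC TGT GCC"
print(translate_row(row))
```

A codon that is not in the table (for example one in which the scanner read `O` for `C` or `G`) is reported rather than guessed, which keeps the check from silently repairing the sequence.
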

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Gly Ala Ala Cys Arg Val Asn Cys Ser Gly Arg Gly Leu Arg Thr Leu 
1 5 10 15

Gly Pro Ala Leu Arg Ile Pro Ala Asp Ala Thr Ala Leu Asp Val Ser
20 25 30 

His Asn Leu Leu Arg Ala Leu Asp Val Gly Leu Leu Ala Asn Leu Ser
35 40 45 

Ala Leu Ala Glu Leu Asp Ile Ser Asn Asn Lys Ile Ser Thr Leu Glu
50 55 60

Glu Gly Ile Phe Ala Asn Leu Phe Asn Leu Ser Glu Ile Asn Leu Ser
65 70 75 80 

Gly Asn Pro Phe Glu Cys Asp Cys Gly Leu Ala Trp Leu Pro Arg Trp 

85 90 95 

Ala Glu Glu Gln Gln Val Arg Val Val Gln Pro Glu Ala Ala Thr Cys
100 105 110 



Ala Gly Pro Gly Ser Leu Ala Gly Gln Pro Leu Leu Gly Ile Pro Leu
115 120 125

Leu Asp Ser Gly Cys Gly Glu Glu Tyr Val Ala Cys Leu Pro Asp Asn
130 135 140 

Ser Ser Gly Thr Val Ala Ala Val Ser Phe Ser Ala Ala His Glu Gly 
145 150 155 160 

Leu Leu Gln Pro Glu Ala Cys Ser Ala Phe Cys Phe Ser Thr Gly Gln
165 170 175

Gly Leu Ala Ala Leu Ser Glu Gin Gly Trp Cys Leu Cys Gly Ala Ala 
180 185 190 

Gln Pro Ser Ser Ala Ser Phe Ala Cys Leu Ser Leu Cys Ser Gly Pro
195 200 205 

Pro Pro Pro Pro Ala Pro Thr Cys Arg Gly Pro Thr Leu Leu Gln His
210 215 220 

Val Phe Pro Ala Ser Pro Gly Ala Thr Leu Val Gly Pro His Gly Pro 
225 230 235 240 

Leu Ala Ser Gly Gin Leu Ala Ala Phe His lie Ala Ala Pro Leu Pro 

245 250 255 

Val Thr Ala Thr Arg Trp Asp Phe Gly Asp Gly Ser Ala Glu Val Asp
260 265 270 

Ala Ala Gly Pro Ala Ala Ser His Arg Tyr Val Leu Pro Gly Arg Tyr 
275 280 285 

His Val Thr Ala Val Leu Ala Leu Gly Ala Gly Ser Ala Leu Leu Gly 
290 295 300 

Thr Asp Val Gin Val Glu Ala Ala Pro Ala Ala Leu Glu Leu Val cys 
305 310 315 320 

Pro Ser Ser Val Gin Ser Asp Glu Ser Leu Asp Leu Ser lie Gin Asn 

325 330 335 

Arg Gly Gly Ser Gly Leu Glu Ala Ala Tyr Ser He Val Ala Leu Gly 
340 345 350 

Glu Glu Pro Ala Arg Ala Val His Pro Leu Cys Pro Ser Asp Thr Glu 
355 360 365 

He Phe Pro Gly Asn Gly His Cys Tyr Arg Leu Val Val Glu Lys Ala 
370 375 380 

Ala Trp Leu Gin Ala Gin Glu Gin Cys Gin Ala Trp Ala Gly Ala Ala 
385 390 395 400 

Leu Ala Met Val Asp Ser Pro Ala Val Gln Arg Phe Leu Val Ser Arg
405 410 415

Val Thr Arg Ser Leu Asp Val Trp lie Gly Phe Ser Thr Val Gin Gly 
420 425 430 

Val Glu Val Gly Pro Ala Pro Gln Gly Glu Ala Phe Ser Leu Glu Ser
435 440 445 

Cys Gln Asn Trp Leu Pro Gly Glu Pro His Pro Ala Thr Ala Glu His
450 455 460

Cys Val Arg Leu Gly Pro Thr Gly Trp Cys Asn Thr Asp Leu Cys Ser
465 470 475 480 

Ala Pro His Ser Tyr Val Cys Glu Leu Gin Pro Gly Gly Pro Val Gin 

485 490 495 

Asp Ala Glu Asn Leu Leu Val Gly Ala Pro Ser Gly Asp Leu Gin Gly 
500 505 510 

Pro Leu Thr Pro Leu Ala Gln Gln Asp Gly Leu Ser Ala Pro His Glu
515 520 525

Pro Val Glu Val Met Val Phe Pro Gly Leu Arg Leu Ser Arg Glu Ala 
530 535 540 

Phe Leu Thr Thr Ala Glu Phe Gly Thr Gin Glu Leu Arg Arg Pro Ala 
545 550 555 560 

Gin Leu Arg Leu Gin Val Tyr Arg Leu Leu Ser Thr Ala Gly Thr Pro 

565 570 575 

Glu Asn Gly Ser Glu Pro Glu Ser Arg Ser Pro Asp Asn Arg Thr Gin 
580 585 590 

Leu Ala Pro Ala Cys Met Pro Gly Gly Arg Trp Cys Pro Gly Ala Asn 
595 600 605 

Ile Cys Leu Pro Leu Asp Ala Ser Cys His Pro Gln Ala Cys Ala Asn
610 615 620 

Gly Cys Thr Ser Gly Pro Gly Leu Pro Gly Ala Pro Ala Leu Trp
625 630 635 640 

Arg Glu Phe Leu Phe Ser Val Ala Ala Gly Pro Pro Ala Gin Tyr Ser 

645 650 655 

Val Thr Leu His Gly Gin Asp Val Leu Met Leu Pro Gly Asp Leu Val 
660 665 670 

Gly Leu Gin His Asp Ala Gly Pro Gly Ala Leu Leu His Cys Ser Pro 
675 680 685 

Ala Pro Gly His Pro Gly Pro Gin Ala Pro Tyr Leu Ser Ala Asn Ala 
690 695 700 

Ser Ser Trp Leu Pro His Leu Pro Ala Gin Leu Glu Gly Thr Trp Ala 
705 710 715 720 

Cys Pro Ala Cys Ala Leu Arg Leu Leu Ala Ala Thr Glu Gin Leu Thr 

725 730 735 

Val Leu Leu Gly Leu Arg Pro Asn Pro Gly Leu Arg Met Pro Gly Arg
740 745 750 

Tyr Glu Val Arg Ala Glu Val Gly Asn Gly Val Ser Arg His Asn Leu 
755 760 765 

Ser Cys Ser P^ Asp Val Val Ser Pro Val Ala Gly Leu Arg Val lie 
770 775 780 

Tyr Pro Ala Pro Arg Asp Glv Arg Leu T^t: Val Pro Thr Asn Gly Ser 
785 790 795 800

Ala Leu Val Leu Gln Val Asp Ser Gly Ala Asn Ala Thr Ala Thr Ala
805 810 815

Arg Trp Pro Gly Gly Ser Val Ser Ala Arg Phe Glu Asn Val Cys Pro 
820 825 830 

Ala Leu Val Ala Thr Phe Val Pro Gly Cys Pro Trp Glu Thr Asn Asp 
835 840 845 

Thr Leu Phe Ser Val Val Ala Leu Pro Trp Leu Ser Glu Gly Glu His
850 855 860 

Val Val Asp Val Val Val Glu Asn Ser Ala Ser Arg Ala Asn Leu Ser 
865 870 875 880

Leu Arg Val Thr Ala Glu Glu Pro He Cys Gly Leu Arg Ala Thr Pro 

885 890 895 

Ser Pro Glu Ala Arg Val Leu Gin Gly Val Leu Val Arg Tyr Ser Pro 
900 905 910 

Val Val Glu Ala Gly Ser Asp Met Val Phe Arg Trp Thr Ile Asn Asp
915 920 925 

Lys Gin Ser Leu Thr Phe Gin Asn Val Val Phe Asn Val He Tyr Gin 
930 935 940 

Ser Ala Ala Val Phe Lys Leu Ser Leu Thr Ala Ser Asn His Val Ser 
945 950 955 960 

Asn Val Thr Val Asn Tyr Asn Val Thr Val Glu Arg Met Asn Arg Met 

965 970 975 

Gin Gly Leu Gin Val Ser Thr Val Pro Ala Val Leu Ser Pro Asn Ala 
980 985 990 

Thr Leu Val Leu Thr Gly Gly Val Leu Val Asp Ser Ala Val Glu Val 
995 1000 1005 

Ala Phe Leu Trp Asn Phe Gly Asp Gly Glu Gin Ala Leu His Gin Phe 
1010 1015 1020 

Gln Pro Pro Tyr Asn Glu Ser Phe Pro Val Pro Asp Pro Ser Val Ala
1025 1030 1035 1040 

Gln Val Leu Val Glu His Asn Val Met His Thr Tyr Ala Ala Pro Gly
1045 1050 1055

Glu Tyr Leu Leu Thr Val Leu Ala Ser Asn Ma Phe Glu Asn Leu Thr 
1060 1065 1070 

Gin Gin Val Pro Val Ser Val Arg Ala Ser Leu Pro Ser Val Ala Val 
1075 1080 1085 

Gly Val Ser Asp Gly Val Leu Val Ala Gly Arg Pro Val Thr Phe Tyr 
1090 1095 1100 

Pro His Pro Leu Pro Ser Pro Gly Gly Val Leu Tyr Thr Trp Asp Phe
1105 1110 1115 1120 

Gly Asp Gly Ser Pro Val Leu Thr Gln Ser Gln Pro Ala Ala Asn His
1125 1130 1135

Thr Tyr Ala Ser Arg Gly Thr Tyr His Val Arg Leu Glu Val Asn Asn
1140 1145 1150 

Thr Val Ser Gly Ala Ala Ala Gln Ala Asp Val Arg Val Phe Glu Glu
1155 1160 1165

Leu Arg Gly Leu Ser Val Asp Met Ser Leu Ala Val Glu Gln Gly Ala
1170 1175 1180 

Pro Val Val Val Ser Ala Ala Val Gln Thr Gly Asp Asn Ile Thr Trp
1185 1190 1195 1200

Thr Phe Asp Met Gly Asp Gly Thr Val Leu Ser Gly Pro Glu Ala Thr
1205 1210 1215

Val Glu His Val Tyr Leu Arg Ala Gin Asn Cys Thr Val Thr Val Gly 
1220 1225 1230 

Ala Ala Ser Pro Ala Gly His Leu Ala Arg Ser Leu His Val Leu Val 
1235 1240 1245 

Phe Val Leu Glu Val Leu Arg Val Glu Pro Ala Ala Cys lie Pro Thr 
1250 1255 1260 

Gin Pro Asp Ala Arg Leu Thr Ala Tyr Val Thr Gly Asn Pro Ala His 
1265 1270 1275 1280 

Tyr Leu Phe Asp Trp Thr Phe Gly Asp Gly Ser Ser Asn Thr Thr Val 

1285 1290 1295 

Arg Gly Cys Pro Thr Val Thr His Asn Phe Thr Arg Ser Gly Thr Phe
1300 1305 1310 

Pro Leu Ala Leu Val Leu Ser Ser Arg Val Asn Arg Ala His Tyr Phe 
1315 1320 1325 

Thr Ser He Cys Val Glu Pro Glu Val Gly Asn Val Thr leu Gin Pro 
1330 1335 1340 

Glu Arg Gln Phe Val Gln Leu Gly Asp Glu Ala Trp Leu Val Ala Cys
1345 1350 1355 1360

Ala Trp Pro Pro Phe Pro Tyr Arg Tyr Thr Trp Asp Phe Gly Thr Glu
1365 1370 1375

Glu Ala Ala Pro Thr Arg Ala Arg Gly Pro Glu Val Thr Phe He Tyr 
1380 1385 1390 

Arg Asp Pro Gly Ser Tyr Leu Val Thr Val Thr Ala Ser Asn Asn He 
1395 1400 1405 

Ser Ala Ala Asn Asp Ser Ala Leu Val Glu Val Gin Glu Pro Val Leu 
1410 1415 1420 

Val Thr Ser He Lys Val Asn Gly Ser Leu Gly Leu Glu Leu Gin Gin 
1425 1430 1435 1440 

Pro Tyr Leu Phe Ser Ala Val Gly Arg Gly Arg Pro Ala Ser Tyr Leu 

1445 1450 1455

Trp Asp Leu Gly Asp Gly Gly Trp Leu Glu Gly Pro Glu Val Thr His
1460 1465 1470 

Ala Tyr Asn Ser Thr Gly Asp Phe Thr Val Arg Val Ala Gly Trp Asn 
1475 1480 1485 

Glu Val Ser Arg Ser Glu Ala Trg Leu Asn Val Thr Val Lys Arg Arg 
1490 1495 1500 

Val Arg Gly Leu Val Val Asn Ala Ser Arg Thr Val Val Pro Leu Asn 
1505 1510 1515 1520

Gly Ser Val Ser Phe Ser Thr Ser Leu Glu Ala Gly Ser Asp Val Arg
1525 1530 1535



Tyr Ser Trp Val Leu Cys Asp Arg Cys Thr Pro lie Pro Gly Gly Pro 
1540 1545 1550 

Thr He Ser Tyr Thr Phe Arg Ser Val Gly Thr Phe Asn lie He Val 
1555 1560 1565 

Thr Ala Glu Asn Glu Val Gly Ser Ala Gln Asp Ser Ile Phe Val Tyr
1570 1575 1580

Val Leu Gln Leu Ile Glu Gly Leu Gln Val Val Gly Gly Gly Arg Tyr
1585 1590 1595 1600

Phe Pro Thr Asn His Thr Val Gin Leu Gin Ala Val Val Arg Asp Gly 

1605 1610 1615 

Thr Asn Val Ser Tyr Ser Trp Thr Ala Trp Arg Asp Arg Gly Pro Ala
1620 1625 1630 

Leu Ala Gly Ser Gly Lys Gly Phe Ser Leu Thr Val Leu Glu Ala Gly 
1635 1640 1645 

Thr Tyr His Val Gin Leu Arg Ala Thr Asn Met Leu Gly Ser Ala Trp 
1650 1655 1660 

Ala Asp Cys Thr Met Asp Phe Val Glu Pro Val Gly Trp Leu Met Val 
1665 1670 1675 1680

Thr Ala Ser Pro Asn Pro Ala Ala Val Asn Thr Ser Val Thr Leu Ser 

1685 1690 1695 

Ala Glu Leu Ala Gly Gly Ser Gly Val Val Tyr Thr Trp Ser Leu Glu
1700 1705 1710

Glu Gly Leu Ser Trp Glu Thr Ser Glu Pro Phe Thr Thr His Ser Phe
1715 1720 1725

Pro Thr Pro Gly Leu His Leu Val Thr Met Thr Ala Gly Asn Pro Leu
1730 1735 1740

Gly Ser Ala Asn Ala Thr Val Glu Val Asp Val Gln Val Pro Val Ser
1745 1750 1755 1760

Gly Leu Ser Ile Arg Ala Ser Glu Pro Gly Gly Ser Phe Val Ala Ala
1765 1770 1775

Gly Ser Ser Val Pro Phe Trp Gly Gln Leu Ala Thr Gly Thr Asn Val
1780 1785 1790 

Ser Trp Cys Trp Ala Val Pro Gly Gly Ser Ser Lys Arg Gly Pro His 
1795 1800 1805 

Val TtTT Met Val Phe Pro Asp Ala Gly Thr Phe Ser lie Arg teu Asn 
1810 1815 1820 

Ala Ser Asn Ala Val Ser Trp Val Ser Ala Thr Tyr Asn Leu Thr Ala
1825 1830 1835 1840 

Glu Glu Pro lie Val Gly Leu Val Leu Trp Ala Ser Ser Lys Val Val 

1845 1850 1855 

Ala Pro Gly Gin Leu Val His Phe Gin lie Leu Leu Ala Ala Gly Ser 
1860 1865 1870 

Ala Val Thr Phe Arg Leu Gin Val Gly Gly Ala Asn Pro Glu Val Leu 
1875 1880 1885

Pro Gly Pro Arg Phe Ser His Ser Phe Pro Arg Val Gly Asp His Val 
1890 1895 1900 

Val Ser Val Arg Gly Lys Asn His Val Ser Trp Ala Gln Ala Gln Val
1905 1910 1915 1920 

Arg lie Val Val Leu Glu Ala Val Ser Gly Leu Gin ^at Pro Asn Cys 

1925 1930 1935 

Cys Glu Pro Gly Ile Ala Thr Gly Thr Glu Arg Asn Phe Thr Ala Arg
1940 1945 1950 

Val Gln Arg Gly Ser Arg Val Ala Tyr Ala Trp Tyr Phe Ser Leu Gln
1955 1960 1965 

Lys Val Gin Gly Asp Ser Leu Val lie Leu Ser Gly Arg Asp Val Thr 
1970 1975 1980 

Tyr Thr Pro Val Ala Ala Gly Leu Leu Glu lie Gin Val Arg Ala Phe 
1985 1990 1995 2000 

Asn Ala Leu Gly Ser Glu Asn Arg Thr Leu Val Leu Glu Val Gin Asp 

2005 2010 2015 

Ala Val Gin Tyr Val Ala Leu Gin Ser Gly Pro Cys Phe Thr Asn Arg 
2020 2025 2030 

Ser Ala Gin Phe Glu Ala Ala Thr Ser Pro Ser Pro Arg Arg Val Ala 
2035 2040 2045 

Tyr His Trp Asp Phe Gly Asp Gly Ser Pro Gly Gln Asp Thr Asp Glu
2050 2055 2060 

Pro Arg Ala Glu His Ser Tyr Leu Arg Pro Gly Asp Tyr Arg Val Gin 
2065 2070 2075 2080 

Val Asn Ala Ser Asn Leu Val Ser Phe Phe Val Ala Gin Ala Thr Val 

2085 2090 2095

Thr Val Gln Val Leu Ala Cys Arg Glu Pro Glu Val Asp Val Val Leu
2100 2105 2110 

Pro leu Gin Val leu Met Arg Arg Ser Gin Arg Asn Tyr Leu Glu Ala 
2115 2120 2125 

His Val Asp Leu Arg Asp Cys Val Thr Tyr Gin Thr Glu Tyr Arg Trp 
2130 2135 2140 

Glu Val Tyr Arg Thr Ala Ser cys Gin Arg Pro Gly Arg Pro Ala Arg 
2145 2150 2155 2160 

Val Ala Leu Pro Gly Val Asp Val Ser Arg Pro Arg Leu Val Leu Pro 

2165 2170 2175 • 

Arg Leu Ala Leu Pro Val Gly His Tyr cys Phe Val Fte Val Val Ser 
2180 2185 2190 

Rie Gly Asp Thr Pro Lbu Thr Gin Ser lie Gin Ala Asn Val Thr Val 
2195 2200 2205 

Ala Pro Glu Arg Leu Val Pro lie lie Glu Gly Gly Ser Tyr Arg Val 
2210 2215 2220 

Trp Ser Asp Thr Arg Asp Leu Val Leu Asp Gly Ser Glu Ser Tyr Asp 
2225 2230 2235 2240 

Pro Asn Leu Glu Asp Gly Asp Gin Thr Pro Leu Ser Phe His Trp Ala 

2245 2250 2255 

cys Val Ala Ser Thr Gin Arg Glu Ala Gly Gly Cys Ala Leu Asn Phe 
2260 2265 2270 

Gly Pro Arg Gly Ser Ser Thr Val Thr He Pro Arg Glu Arg Leu Ala 
2275 2280 2285 

Ala Gly Val Glu Tyr Tbr Phe Ser Leu Thr Val Trp Lys Ala Gly Arg 
2290 2295 2300 

Lys Glu Glu Ala Thr Asn Gin Thr Val Leu He Arg Ser Gly Arg Val 
2305 2310 2315 2320 

Pro He Val Ser Leu Glu Cys Val Ser Cys Lys Ala Gin Ala Val Tyr 

2325 2330 2335 

Glu Val Ser Arg Ser Ser Tyr Val lyr Leu Glu Gly Arg Cys Leu Asn 
2340 2345 2350 

cys Ser Ser Gly Ser Lys Arg Gly Arg Trp Ala Ala Arg Thr Phe Ser 
2355 2360 2365 

Asn Lys Thr Leu Val leu Asp Glu Thr Thr Thr Ser Thr Gly Ser Ala 
2370 2375 2380 

Gly Met Arg Leu Val Leu Arg Arg Gly Val Leu Arg Asp Gly Glu Gly
2385 2390 2395 2400 

Tyr Thr Phe Thr Leu Thr Val Leu Gly Arg Ser Gly Glu Glu Glu Gly 

2405 2410 2415

Cys Ala Ser Ile Arg Leu Ser Pro Asn Arg Pro Pro Leu Gly Gly Ser
2420 2425 2430 

cys Arg Leu Phe Pro Leu Gly Ala Val His Ala Leu Thr Thr Lys Val 
2435 2440 2445 

His Phe Glu Cys Thr Gly Trp His Asp Ala Glu Asp Ala Gly Ala Pro 
2450 2455 2460 

leu Val Tyr Ala Leu Leu Leu Arg Arg Arg Gin Gly His C^^ Glu 
2465 2470 2475 2480 

Glu Phs Cys Val lyr Lys Gly Ser Leu Ser Ser Tyr Gly Ala Val Leu 

2485 2490 2495 

Pro Pro Gly Phe Arg Pro His Phe Glu Val Gly Leu Ala Val Val Val 
2500 2505 2510 

Gin Asp Gin leu Gly Ala Ala Val Val Ala Leu Asn Arg Ser Leu Ala 
2515 2520 2525 

lie Thr Leu Pro Glu Pro Asn Gly Ser Ala Thr Gly Leu Thr Val Trp 
2530 2535 2540 

Leu His Gly Leu Thr Ala Ser Val Leu Pro Gly Leu Leu Arg Gin Ala 
2545 2550 2555 2560 



Asp Pro Gin His Val lie Glu Tyr Ser leu Ala Leu Val Thr Val Leu 

2565 2570 2575 

Asn Glu Tyr Glu Arg Ala Leu Asp Val Ala Ala Glu Pro Lys His Glu 
2580 2585 2590 

Arg Gin His Arg Ala Gin lie Arg Lys Asn lie Thr Glu Thr Leu Val 
2595 2600 2605 

Ser Leu Arg Val His Thr Val Asp Asp lie Gin Gin lie Ala Ala Ala 
2610 2615 2620 

Leu Ala Gin Cys Met Gly Pro Ser Arg Glu Leu Val Cys Arg Ser Cys 
2625 2630 2635 2640

Leu Lys Gln Thr Leu His Lys Leu Glu Ala Met Met Leu Ile Leu Gln
2645 2650 2655

Ala Glu Thr Thr Ala Gly Thr Val Thr Pro Thr Ala lie Gly Asp Ser 
2660 2665 2670 

lie Leu Asn lie Thr Gly Asp Leu lie His Leu Ala Ser Ser Asp Val 
2675 2680 2685 

Arg Ala Pro Gin Pro Ser Glu Leu Gly Ala Glu Ser Pro Ser Arg Met 
2690 2695 2700 

Val Ala Ser Gin Ala Tyr Asn Leu Thr Ser Ala Leu Met Arg lie Leu 
2705 2710 2715 2720 

Met Arg Ser Arg Val Leu Asn Glu Glu Pro Leu Thr Leu Ala Gly Glu 

2725 2730 2735

Glu Ile Val Ala Gln Gly Lys Arg Ser Asp Pro Arg Ser Leu Leu
2740 2745 2750 

Tyr Gly Gly Ala Pro Gly Pro Gly Cys His Phe Ser lie Pro Glu Ala 
2755 2760 2765 

Phe Ser Gly Ala Leu Ala Asn Leu Ser Asp Val Val Gin Leu lie Pte 
2770 2775 2780 

Leu Val Asp Ser Asn Pro Phe Pre Phe Gly Tyr He Ser Asn Tyr Ihr 
2785 2790 2795 2800 

Val Ser Thr Lys Val Ala Ser Met Ala Phe Gin Thr Gin Ala Gly Ala 

2805 2810 2815 

Gin He Pro He Glu Arg lau Ala Ser Glu Arg Ala He Thr Val Lys 
2820 2825 2830 

Val Pro Asn Asn Ser Asp Trp Ala Ala Arg Gly His Arg Ser Ser Ala 
2835 2840 2845 

Asn Ser Ala Asn Ser Val Val Val Gin Pro Gin Ala Ser Val Gly Ala 
2850 2855 2860

Val Val Thr Leu Asp Ser Ser Asn Pro Ala Ala Gly Leu His Leu Gin 
2865 2870 2875 2880 

Leu Asn Tyr Thr Leu Leu Asp Gly His Tyr Leu Ser Glu Glu Pro Glu 

2885 2890 2895 

Pro Tyr Leu Ala Val Tyr Leu His Ser Glu Pro Arg Pro Asn Glu His 
2900 2905 2910 

Asn Cys Ser Ala Ser Arg Arg He Arg Pro Glu Ser Leu Gin Gly Ala 
2915 2920 2925 

Asp His Arg Pro Tyr Thr Phe Phe He Ser Pro Gly Ser Arg Asp Pro 
2930 2935 2940 

Ala Gly Ser Tyr His Leu Asn Leu Ser Ser His Phe Arg Trp Ser Ala 
2945 2950 2955 2960 

Leu Gin Val Ser Val Gly Leu Tyr Thr Ser Leu Cys Gin Tyr Phe Ser 

2965 2970 2975 

Glu Glu Asp Met Val Trp Arg Thr Glu Gly Leu Leu Pro Leu Glu Glu 
2980 2985 2990 

Thr Ser Pro Arg Gin Ala Val Cys Leu Thr Arg His Leu Thr Ala Phe 
2995 3000 3005 

Gly Ala Ser Leu Phe Val Pro Pro Ser His Val Arg Phe Val Phe Pro 
3010 3015 3020 

Glu Pro Thr Ala Asp Val Asn Tyr He Val Met Leu Thr Cys Ala Val 
3025 3030 3035 3040 

Cys Leu Val Thr Tyr Met Val Met Ala Ala He Leu His Lys Leu Asp 

3045 3050 3055

Gln Leu Asp Ala Ser Arg Gly Arg Ala Ile Pro Phe Cys Gly Gln Arg
3060 3065 3070 

Gly Arg Phe Lys Tyr Glu lie Leu Val Lys Thr Gly Trp Gly Arg Gly 
3075 3080 3085 

Ser Gly Thr Thr Ala His Val Gly He Met Leu Tyr Gly Val Asp Ser 
3090 3095 3100 

Arg Ser Gly His Arg His Leu Asp Gly Asp Arg Ala Pte His Arg Asn 
3105 3110 3115 3120 

Ser Leu Asp He Phe Arg He Ala Thr Pro His Ser Leu Gly Ser Val 

3125 3130 3135 

Trp Lys He Arg Val Trp His Asp Asn Lys Gly Leu Ser Pro Ala Trp 
3140 3145 3150 

Phe Leu Gin His Val He Val Arg Asp Leu Gin Thr Ala Arg Ser Ala 
3155 3160 3165 

Phe Phe Leu Val Asn Asp Trp Leu Ser Val Glu Thr Glu Ala Asn Gly 
3170 3175 3180 

Gly Leu Val Glu Lys Glu Val Leu Ala Ala Ser Asp Ala Ala Leu Leu 
3185 3190 3195 3200 

Arg Rie Arg Arg Leu Leu Val Ala Glu Leu Gin Arg Gly Phe Phs Asp 

3205 3210 3215 

Lys His He Trp Leu Ser He Trp Asp Arg Pro Pro Arg Ser Arg Phe 
3220 3225 3230 

Thr Arg He Gin Arg Ala Thr Cys Cys Val Leu Leu He Cys Leu Phe 
3235 3240 3245 

Leu Gly Ala Asn Ala Val Trp Tyr Gly Ala Val Gly Asp Ser Ala T^r 
3250 3255 3260 

Ser Thr Gly His Val Ser Arg Leu Ser Pro Leu Ser Val Asp Thr Val 
3265 3270 3275 3280 

Ala Val Gly Leu Val Ser Ser Val Val Val Tyr Pro Val lyr Leu Ala 

3285 3290 3295 

Ile Leu Phe Leu Phe Arg Met Ser Arg Ser Lys Val Ala Gly Ser Pro
3300 3305 3310 

Ser Pro Thr Pro Ala Gly Gin Gin Val Leu Asp He Asp Ser Cys Leu 
3315 3320 3325 

Asp Ser Ser Val Leu Asp Ser Ser Phe Leu Thr Phe Ser Gly Leu His
3330 3335 3340 

Ala Glu Ala Phe Val Gly Gin Met Lys Ser Asp Leu Phe Leu Asp Asp 
3345 3350 3355 3360 

Ser Lys Ser Leu Val Cys Trp Pro Ser Gly Glu Gly Thr Leu Ser Trp 

3365 3370 3375

Event ID 1000, 1001 Logged Every 5 Min in Application Event Log
[Q290647]

PSS ID Number: Q290647 
Article last modified on 05-02-2001 

:2000 



The information in this article applies to: 

- Microsoft Windows 2000 Advanced Server
- Microsoft Windows 2000 Server



SYMPTOMS 



Group Policies are not replicated between domain controllers; therefore, users
do not receive Group Policies for computers. Event IDs 1000 and 1001 may be
logged in the Application Log every five minutes with the following information:

Type: Error
Event ID: 1000
Source: Userenv
Category: None
User: NT AUTHORITY\SYSTEM

Description: Windows cannot access the registry information at
\\<domain>\sysvol\<domain>\Policies\{31B2F340-016D-11D2-945F-00C04FB984F9}\Machi

with (5).

Type: Error
Event ID: 1001
Source: SceCli
Category: None
User: N/A

Description: Security policy cannot be propagated. Cannot access the template.
Error code = 3.

\\<domain>\sysvol\<domain>\Policies\{31B2F340-016D-11D2-945F-00C04FB984F9}\Machi
NT\SecEdit\GptTmpl.inf.

Type: Error
Event ID: 1000
Source: Userenv
Category: None
User: NT AUTHORITY\SYSTEM

Description: The Group Policy client-side extension Security was passed flags
(17) and returned a failure status code of (3).
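
The three records above can be recognized by their source and event ID alone. As a minimal sketch, assuming the Application Log has already been exported into a list of dictionaries (the keys `type`, `source`, and `event_id` are hypothetical names chosen for this illustration, not an export format defined by the article), the matching records could be picked out like this:

```python
from collections import Counter

# (source, event ID) pairs named in the symptoms above.
WATCHED = {("Userenv", 1000), ("SceCli", 1001)}


def matching_events(records):
    """Keep only Error records whose source and event ID match the symptoms."""
    return [
        r for r in records
        if r.get("type") == "Error"
        and (r.get("source"), r.get("event_id")) in WATCHED
    ]


def summarize(records):
    """Count the matching records per (source, event ID) pair."""
    return Counter((r["source"], r["event_id"]) for r in matching_events(records))


# Hypothetical exported records, for illustration only.
sample = [
    {"type": "Error", "source": "Userenv", "event_id": 1000},
    {"type": "Error", "source": "SceCli", "event_id": 1001},
    {"type": "Information", "source": "Userenv", "event_id": 1000},
    {"type": "Error", "source": "NtFrs", "event_id": 13508},
]
print(summarize(sample))
```
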

CAUSE 

This issue may occur if you assign improper permissions to the
%SystemRoot%\Winnt\Sysvol folder or if you assign improper groups to the Bypass
Traverse Checking user right under User Rights Assignment.

RESOLUTION

To resolve this issue:

1. Set the folder security permissions. To access the security permissions, 
right-click the folder, click Properties, and then click the Security tab. 

- %SystemRoot%\Winnt\Sysvol : 

Administrators: Full Control
Authenticated Users: Read, Read & Execute, and List Folder Contents 
Creator Owner: Nothing selected 

Server Operators: Read, Read & Execute, and List Folder Contents 
System: Full Control

Click to clear: "Allow inheritable permissions from parent to propagate to 
this object" 

- %SystemRoot%\Winnt\Sysvol\Sysvol : 

This folder inherits all of its permissions from its parent. 

- %SystemRoot%\Winnt\Sysvol\Sysvol\<domain> : 

This folder inherits all of its permissions from its parent. 

- %SystemRoot%\Winnt\Sysvol\Sysvol\<domain>\Policies : 

Administrators: Full Control 
Authenticated Users: Read, Read & Execute, and List Folder Contents 
Creator Owner: Nothing selected 

Group Policy Creator Owners: Read, Read & Execute, List Folder Contents, 
Modify, and Write 

Server Operators: Read, Read & Execute, and List Folder Contents 
System: Full Control

Click to clear: "Allow inheritable permissions from parent to propagate to 
this object" 

- %SystemRoot%\Winnt\Sysvol\Sysvol\<domain>\Policies :

Click to select for all subfolders and files: "Allow inheritable permissions 
from parent to propagate to this object" 

2. Open Active Directory Users and Computers: Click Start, click Programs, and 
then click Administrative Tools. 

3. Expand Active Directory Users and Computers, and then expand the domain name. 

4. Right-click Domain Controllers, and then click Properties. 

5. On the Group Policy tab, click "Default Domain Controllers Policy", and then 
click Edit. 

6. Expand the folders: 

Computer Configuration 
Windows Settings 
Security Settings 
Local Policies 

7. Click User Rights Assignment, and then double-click "Bypass traverse 
checking". The following default settings should be present: 

Authenticated Users 
Everyone 
Administrators 

To add these groups if they are not present, click Add, and then click Browse. 

8. At a command prompt, type:

secedit /refreshpolicy machine_policy /enforce
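
The permission lists in step 1 amount to a small table of expected rights per group. The following is a minimal sketch of how observed permissions could be compared against that table; the function name `acl_differences`, the folder keys, and the shape of the `observed` mapping are assumptions made for this illustration, and the sketch does not read permissions from the file system itself.

```python
FULL = frozenset({"Full Control"})
READ = frozenset({"Read", "Read & Execute", "List Folder Contents"})

# Expected permissions transcribed from step 1 above.
EXPECTED = {
    "Sysvol": {
        "Administrators": FULL,
        "Authenticated Users": READ,
        "Creator Owner": frozenset(),
        "Server Operators": READ,
        "System": FULL,
    },
    "Policies": {
        "Administrators": FULL,
        "Authenticated Users": READ,
        "Creator Owner": frozenset(),
        "Group Policy Creator Owners": READ | {"Modify", "Write"},
        "Server Operators": READ,
        "System": FULL,
    },
}


def acl_differences(folder, observed):
    """Map each group whose observed rights differ to (expected, observed)."""
    expected = EXPECTED[folder]
    diffs = {}
    for group in sorted(set(expected) | set(observed)):
        want = expected.get(group, frozenset())
        have = frozenset(observed.get(group, ()))
        if want != have:
            diffs[group] = (want, have)
    return diffs


# Hypothetical observation: Authenticated Users is missing two rights.
observed = dict(EXPECTED["Sysvol"])
observed["Authenticated Users"] = {"Read"}
print(acl_differences("Sysvol", observed))
```

A group listed with nothing selected (Creator Owner) compares equal to a group that is absent from the observed mapping, which matches the intent of the checklist.
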

MORE INFORMATION

For additional information, click the article numbers below to view the articles
in the Microsoft Knowledge Base: 

Q271213 Event ID 1000 and 1001 Repeat Every 5 Minutes in the Event Log

Q259398 SceCli Event ID 1001 and UserEnv Event ID 1000 When Dfs Client Is
Disabled

Q285923 Error Messages Every 5 Minutes Report Events 1000, 1001, and 13508,
Citing Replication Trouble

Additional query words: GPO; 1000; 1001; permissions; sysvol 
Keywords : kberrmsg kbtool 

Technology : kbwin2000AdvServSearch kbwin2000Ssearch kbPictureIt2000 kbWinA 

Version : :2000 

Issue type : kbprb 

Copyright Microsoft Corporation 2001.

Error Messages Every 5 Minutes Report Events 1000, 1001, and 135
[Q285923] 

PSS ID Number: Q285923
Article last modified on 01-30-2001 

:2000 



The information in this article applies to: 

- Microsoft Windows 2000 Server 

- Microsoft Windows 2000 Advanced Server 

- Microsoft Windows 2000 Datacenter Server 



SYMPTOMS 



You may find that the following error messages are recorded in Event Viewer
every 5 minutes on domain controller computers and every 20 minutes on member
server computers:

Userenv 1000

Windows cannot access the registry information at
\\domainname.com\sysvol\domainname.com\Policies\{31B2F340-016D-11D2-945F-00C04FB984F9}\Machine\registry.pol
with (1398).

SceCli 1001 

Security policy cannot be propagated. Cannot access the template. Error
code=3.

Userenv 1000

The Group Policy client-side extension Security was passed flags (17) and
returned a failure status code of (3).

NtFrs 13508 

Description: The File Replication Service is having trouble enabling 
replication from (computername) to (computername) for c:\winnt\sysvol\domain; 
retrying. 

RESOLUTION 

To resolve this issue, synchronize the computers with the domain controller 
clock time. Follow these steps: 

1. Run the following command on all computers to synchronize the clock time with
the domain controller:

"net time \\(domain controller name) /set /y" (without the quotation marks)

2. Stop and then restart the File Replication Service on all servers that are 
experiencing the problem. 

3. Open Event Viewer to make sure that the errors are no longer occurring. 
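
The resolution steps above can be sketched as a small script. This is only an illustrative sketch, not part of the original article: the helper names, the example computer name DC01, and the use of "ntfrs" as the short service name for the File Replication Service are assumptions. By default it only returns the command lines (dry run) and does not execute anything.

```python
import subprocess


def time_sync_command(domain_controller):
    """Step 1: synchronize the local clock with the domain controller."""
    # Raw f-string keeps the two leading backslashes of the UNC name.
    return ["net", "time", rf"\\{domain_controller}", "/set", "/y"]


def frs_restart_commands(service="ntfrs"):
    """Step 2: stop, then start, the File Replication Service.

    "ntfrs" is assumed to be the service's short name.
    """
    return [["net", "stop", service], ["net", "start", service]]


def resolve(domain_controller, restart_frs=False, dry_run=True):
    """Return the command lines in order; run them only if dry_run is False."""
    commands = [time_sync_command(domain_controller)]
    if restart_frs:  # only on servers that are experiencing the problem
        commands += frs_restart_commands()
    for command in commands:
        if not dry_run:
            subprocess.run(command, check=True)  # Windows only
    return [" ".join(command) for command in commands]


if __name__ == "__main__":
    # "DC01" is a hypothetical domain controller name.
    for line in resolve("DC01", restart_frs=True):
        print(line)
```

Step 3 (checking Event Viewer) is left as a manual check in this sketch.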
Additional query words: 



Keywords : 

Technology : kbwin2000AdvServSearch kbwin2000DataServSearch kbwin2000Ssearc 
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Version : :2000 

Issue type : kbprb 
Pro Asp Leu Leu Ser Asp Pro Ser Ile Val Gly Ser Asn Leu Arg Gln
3380 3385 3390



Leu Ala Arg Gly Gln Ala Gly His Gly Leu Gly Pro Glu Glu Asp Gly
3395 3400 3405 

Phe Ser Leu Ala Ser Pro Tyr Ser Pro Ala Lys Ser Phe Ser Ala Ser 
3410 3415 3420 

Asp Glu Asp Leu Ile Gln Gln Val Leu Ala Glu Gly Val Ser Ser Pro
3425 3430 3435 3440

Ala Pro Thr Gln Asp Thr His Met Glu Thr Asp Leu Leu Ser Ser Leu
3445 3450 3455

Ser Ser Thr Pro Gly Glu Lys Thr Glu Thr Leu Ala Leu Gln Arg Leu
3460 3465 3470

Gly Glu Leu Gly Pro Pro Ser Pro Gly Leu Asn Trp Glu Gln Pro Gln
3475 3480 3485

Ala Ala Arg Leu Ser Arg Thr Gly Leu Val Glu Gly Leu Arg Lys Arg 
3490 3495 3500 

Leu Leu Pro Ala Trp Cys Ala Ser Leu Ala His Gly Leu Ser Leu Leu 
3505 3510 3515 3520

Leu Val Ala Val Ala Val Ala Val Ser Gly Trp Val Gly Ala Ser Phe 

3525 3530 3535 

Pro Pro Gly Val Ser Val Ala Trp Leu Leu Ser Ser Ser Ala Ser Phe 
3540 3545 3550 

Leu Ala Ser Phe Leu Gly Trp Glu Pro Leu Lys Val Leu Leu Glu Ala
3555 3560 3565 

Leu Tyr Phe Ser Leu Val Ala Lys Arg Leu His Pro Asp Glu Asp Asp 
3570 3575 3580 



Thr Leu Val Glu Ser Pro Ala Val Thr Pro Val Ser Ala Arg Val Pro 
3585 3590 3595 3600

Arg Val Arg Pro Pro His Gly Phe Ala Leu Phe Leu Ala Lys Glu Glu 

3605 3610 3615 

Ala Arg Lys Val Lys Arg Leu His Gly Met Leu Arg Ser Leu Leu Val
3620 3625 3630

Tyr Met Leu Phe Leu Leu Val Thr Leu Leu Ala Ser Tyr Gly Asp Ala
3635 3640 3645

Ser Cys His Gly His Ala Tyr Arg Leu Gln Ser Ala Ile Lys Gln Glu
3650 3655 3660

Leu His Ser Arg Ala Phe Leu Ala Ile Thr Arg Ser Glu Glu Leu Trp
3665 3670 3675 3680

Pro Trp Met Ala His Val Leu Leu Pro Tyr Val His Gly Asn Gln Ser
3685 3690 3695



Ser Pro Glu Leu Gly Pro Pro Arg Leu Arg Gln Val Arg Leu Gln Glu
3700 3705 3710 

Ala Leu Tyr Pro Asp Pro Pro Gly Pro Arg Val His Thr Cys Ser Ala 
3715 3720 3725 

Ala Gly Gly Phe Ser Thr Ser Asp Tyr Asp Val Gly Trp Glu Ser Pro 
3730 3735 3740 

His Asn Gly Ser Gly Thr Trp Ala Tyr Ser Ala Pro Asp Leu Leu Gly 
3745 3750 3755 3760 

Ala Trp Ser Trp Gly Ser Cys Ala Val Tyr Asp Ser Gly Gly Tyr Val 

3765 3770 3775 

Gln Glu Leu Gly Leu Ser Leu Glu Glu Ser Arg Asp Arg Leu Arg Phe
3780 3785 3790

Leu Gln Leu His Asn Trp Leu Asp Asn Arg Ser Arg Ala Val Phe Leu
3795 3800 3805

Glu Leu Thr Arg Tyr Ser Pro Ala Val Gly Leu His Ala Ala Val Thr
3810 3815 3820 

Leu Arg Leu Glu Phe Pro Ala Ala Gly Arg Ala Leu Ala Ala Leu Ser 
3825 3830 3835 3840 

Val Arg Pro Phe Ala Leu Arg Arg Leu Ser Ala Gly Leu Ser Leu Pro

3845 3850 3855 

Leu Leu Thr Ser Val Cys Leu Leu Leu Phe Ala Val His Phe Ala Val 
3860 3865 3870

Ala Glu Ala Arg Thr Trp His Arg Glu Gly Arg Trp Arg Val Leu Arg
3875 3880 3885 

Leu Gly Ala Trp Ala Arg Trp Leu Leu Val Ala Leu Thr Ala Ala Thr 
3890 3895 3900 

Ala Leu Val Arg Leu Ala Gln Leu Gly Ala Ala Asp Arg Gln Trp Thr
3905 3910 3915 3920

Arg Phe Val Arg Gly Arg Pro Arg Arg Phe Thr Ser Phe Asp Gln Val

3925 3930 3935 

Ala His Val Ser Ser Ala Ala Arg Gly Leu Ala Ala Ser Leu Leu Phe 
3940 3945 3950 

Leu Leu Leu Val Lys Ala Ala Gln His Val Arg Phe Val Arg Gln Trp
3955 3960 3965

Ser Val Phe Gly Lys Thr Leu Cys Arg Ala Leu Pro Glu Leu Leu Gly
3970 3975 3980

Val Thr Leu Gly Leu Val Val Leu Gly Val Ala Tyr Ala Gln Leu Ala
3985 3990 3995 4000

Ile Leu Leu Val Ser Ser Cys Val Asp Ser Leu Trp Ser Val Ala Gln

4005 4010 4015 
Ala Leu Leu Val Leu Cys Pro Gly Thr Gly Leu Ser Thr Leu Cys Pro
4020 4025 4030 

Ala Glu Ser Trp His Leu Ser Pro Leu Leu Cys Val Gly Leu Trp Ala 
4035 4040 4045 

Leu Arg Leu Trp Gly Ala Leu Arg Leu Gly Ala Val Ile Leu Arg Trp
4050 4055 4060

Arg Tyr His Ala Leu Arg Gly Glu Leu Tyr Arg Pro Ala Trp Glu Pro
4065 4070 4075 4080

Gln Asp Tyr Glu Met Val Glu Leu Phe Leu Arg Arg Leu Arg Leu Trp
4085 4090 4095

Met Gly Leu Ser Lys Val Lys Glu Phe Arg His Lys Val Arg Phe Glu
4100 4105 4110 

Gly Met Glu Pro Leu Pro Ser Arg Ser Ser Arg Gly Ser Lys Val Ser 
4115 4120 4125 

Pro Asp Val Pro Pro Pro Ser Ala Gly Ser Asp Ala Ser His Pro Ser 
4130 4135 4140 

Thr Ser Ser Ser Gln Leu Asp Gly Leu Ser Val Ser Leu Gly Arg Leu
4145 4150 4155 4160

Gly Thr Arg Cys Glu Pro Glu Pro Ser Arg Leu Gln Ala Val Phe Glu
4165 4170 4175

Ala Leu Leu Thr Gln Phe Asp Arg Leu Asn Gln Ala Thr Glu Asp Val
4180 4185 4190

Tyr Gln Leu Glu Gln Gln Leu His Ser Leu Gln Gly Arg Arg Ser Ser
4195 4200 4205 

Arg Ala Pro Ala Gly Ser Ser Arg Gly Pro Ser Pro Gly Leu Arg Pro 
4210 4215 4220 

Ala Leu Pro Ser Arg Leu Ala Arg Ala Ser Arg Gly Val Asp Leu Ala
4225 4230 4235 4240

Thr Gly Pro Ser Arg Thr Pro Ser Gly Gln Glu Gln Gly Pro Pro Gln
4245 4250 4255

Gln His Leu Val Leu Leu Pro Gly Gly Gly Gly Pro Trp Ser Arg Ser
4260 4265 4270

Gly His Arg Ser Val Leu Leu Ser Ala Ala Val Lys Ala Glu Gly Gln
4275 4280 4285

Ala Glu Trp Leu His Val Gly Ser Pro Glu Ser Arg Gln Gly His Leu
4290 4295 4300

Ser Val Cys Gly Leu Gln His Phe Lys Glu Ala Val Trp Pro Thr Arg
4305 4310 4315 4320

Thr Gln Gly Pro Leu Pro Ser Ser Leu Gly Lys Asp Thr Ala Val Leu
4325 4330 4335

Asp Gly Phe

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: (Compare Figure 7)

CIC AAC GAG GAG OCC CTG AOG CTG GOG GGC GAG GAG ATC GTG GOC CAG 48 
Leu Asn Glu Glu Pro Leu Thr Leu Ala Gly Glu Glu Ile Val Ala Gln
4340 4345 4350 4355 

GGC AAG OGC TCG GAG COG OGG AOC CTG CTG TQC TAT GGC GGC GOC OCA 96 
Gly Lys Arg Ser Asp Pro Arg Ser Leu Leu Cys Tyr Gly Gly Ala Pro

4360 4365 4370 

GGG CCr GGC TGC CAC TTC TOC ATC OCC GAG OCT TIC AGO GGG GCC CTG 144 
Gly Pro Gly Cys His Phe Ser Ile Pro Glu Ala Phe Ser Gly Ala Leu
4375 4380 4385 

GCC AAC CrC AGT GAC GIG GTG CAG CTC ATC TIT CTG GTG GAC TOC AAT 192 
Ala Asn Leu Ser Asp Val Val Gln Leu Ile Phe Leu Val Asp Ser Asn
4390 4395 4400 

OCC nr coc tit ggc tat atc agc aac tac acc gtic tcc acc aag gtg 240 

Pro Phe Pro Phe Gly Tyr Ile Ser Asn Tyr Thr Val Ser Thr Lys Val
4405 4410 4415 

GCC TOG ATG GCA TIC CAG ACA CAG GOC GGC GCC CAG ATC OCC ATC GAG 288 
Ala Ser Met Ala Phe Gln Thr Gln Ala Gly Ala Gln Ile Pro Ile Glu
4420 4425 4430 4435 

OGG CTG GOC TCA GAG OGC GCC ATC ACC GIG AAG GIG OCC AAC AAC TCG 336 
Arg Leu Ala Ser Glu Arg Ala Ile Thr Val Lys Val Pro Asn Asn Ser

4440 4445 4450 

GAC TGG GOT GOC OGG GGC CAC OGC AGC TOC GCC AAC TCC GCC AAC TOO 384 
Asp Trp Ala Ala Arg Gly His Arg Ser Ser Ala Asn Ser Ala Asn Ser
4455 4460 4465 

GIT GTG GTC CAG OOC CAG GOC TOC GIC GGT GOT GTG GIC AOC CTG GAC 432 
Val Val Val Gln Pro Gln Ala Ser Val Gly Ala Val Val Thr Leu Asp
4470 4475 4480 

AGC AGC AAC OCT GOG GOC GGG CTG CAT CTG CAG CTC AAC TAT AOG CTG 480 
Ser Ser Asn Pro Ala Ala Gly Leu His Leu Gln Leu Asn Tyr Thr Leu
4485 4490 4495 

CTG GAC GGC CAC TAC CTG TCT GAG GAA CCT GAG OOC TAC CTG GCA GTC 528 
Leu Asp Gly His Tyr Leu Ser Glu Glu Pro Glu Pro Tyr Leu Ala Val 
4500 4505 4510 4515 

TAC CTA CAC TOG GAG OOC COG OOC AAT GAG CAC AAC TGC TOG GCT AGC 576 
Tyr Leu His Ser Glu Pro Arg Pro Asn Glu His Asn Cys Ser Ala Ser 

4520 4525 4530 

AGG AGG ATC OGC OCA GAG TCA CTC CAG GOT GCT GAC CAC OGG 000 TAC 624 
Arg Arg Ile Arg Pro Glu Ser Leu Gln Gly Ala Asp His Arg Pro Tyr
4535 4540 4545 

AOC TIC TTC ATT TOC COG GGG AGC AGA GAC OCA GOG GGG ACT TAC CAT 672 
Thr Phe Phe Ile Ser Pro Gly Ser Arg Asp Pro Ala Gly Ser Tyr His
4550 4555 4560

CTG AAC CIC TOC AGC CAC TTC OGC TGG TCG GOG CTG CAG GTG TOC GTG 720 
Leu Asn Leu Ser Ser His Phe Arg Trp Ser Ala Leu Gln Val Ser Val
4565 4570 4575
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GGC CIG TAC AOS TOC CTG TGC CAG TAG TTC AQC GAG GAG GAC ATG GTG 768 
Gly Leu Tyr Thr Ser Leu Cys Gln Tyr Phe Ser Glu Glu Asp Met Val
4580 4585 4590 4595 

TOG CX3G ACA GAG GQ3 CTG CTG OCC CIG GAG GAG AOC TOG COC CGC CAG 816 
Trp Arg Thr Glu Gly Leu Leu Pro Leu Glu Glu Thr Ser Pro Arg Gln
4600 4605 4610

GOC GIC TOC CTC ACC CGC CAC CIC AOC GCC TTC GGC GCC AQC CTC TTC 864 
Ala Val Cys Leu Thr Arg His Leu Thr Ala Phe Gly Ala Ser Leu Phe
4615 4620 4625 

GTIG GOC OCA AGC CAT GIC OGC TTT (?IG TTT OCT GAG COG ACA GOG GAT 912 
Val Pro Pro Ser His Val Arg Phe Val Phe Pro Glu Pro Thr Ala Asp
4630 4635 4640 

GTA AAC TAG ATC GIC ATC CTG ACA TCTT GCT GIG TGC CIG GIG ACC TAC 960 
Val Asn Tyr Ile Val Met Leu Thr Cys Ala Val Cys Leu Val Thr Tyr
4645 4650 4655

ATO GIC ATC GCC GOC ATC CTG CAC AAG CIG GAC CAG TTG GAT GOC AGC 1008 
Met Val Met Ala Ala Ile Leu His Lys Leu Asp Gln Leu Asp Ala Ser
4660 4665 4670 4675 

CGG GGC OGC GOC ATC OCT TIC TGT GGG CAG COG GGC CGC TTC AAG TAC 1056 
Arg Gly Arg Ala Ile Pro Phe Cys Gly Gln Arg Gly Arg Phe Lys Tyr

4680 4685 4690 

GAG ATC CTC GIC AAG ACA GGC TGG GGC CGG GGC TCA GGT ACC ACG GCC 1104 
Glu Ile Leu Val Lys Thr Gly Trp Gly Arg Gly Ser Gly Thr Thr Ala
4695 4700 4705 

CAC GIG GGC ATC ATO CTG TAT GOG GIG GAC AGC CGG AGC GGC CAC CGG 1152 
His Val Gly Ile Met Leu Tyr Gly Val Asp Ser Arg Ser Gly His Arg
4710 4715 4720 

CAC CIG GAC GOC GAC AGA GOC TIC CAC OGC AAC AGC CTG GAC ATC TIC 1200 
His Leu Asp Gly Asp Arg Ala Phe His Arg Asn Ser Leu Asp Ile Phe
4725 4730 4735 

CGG ATC GCC AOC COG CAC AGC CTG GGT AGC GTG TGG AAG ATO OGA GTG 1248 
Arg Ile Ala Thr Pro His Ser Leu Gly Ser Val Trp Lys Ile Arg Val
4740 4745 4750 4755 

TGG CAC GAC AAC AAA GGG CTC AGC OCT GCC TGG TTC CTC CAG CAC GIC 1296 
Trp His Asp Asn Lys Gly Leu Ser Pro Ala Trp Phe Leu Gln His Val

4760 4765 4770 

ATC arc AGG GAC CTG CAG ACG GCA OGC AGC GCC TTC TTC CTC GIC AAT 1344 
Ile Val Arg Asp Leu Gln Thr Ala Arg Ser Ala Phe Phe Leu Val Asn
4775 4780 4785 



GAC TGG err TOG GIG GAG AOG GAG GOC AAC GGG GGC CTG GTG GAG AAG 1392 
Asp Trp Leu Ser Val Glu Thr Glu Ala Asn Gly Gly Leu Val Glu Lys
4790 4795 4800 

GAG GTO CTG GOC GOG AGC GAC GCA GOC CTT TIG OGC TTC CGG OGC CIG 1440
Glu Val Leu Ala Ala Ser Asp Ala Ala Leu Leu Arg Phe Arg Arg Leu
4805 4810 4815 

CTG GPG GCT GAG CTC CAG CGT GGC TiC TTT GAC AAG CAC ATC TGG CTC 1488 
Leu Val Ala Glu Leu Gln Arg Gly Phe Phe Asp Lys His Ile Trp Leu
4820 4825 4830 4835 
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TCC ATA TOG GAC OGG 0CX3 OCT CX?r AGC CCT TTC ACT CGC ATC GAG AGG 1536 
Ser Ile Trp Asp Arg Pro Pro Arg Ser Arg Phe Thr Arg Ile Gln Arg

4840 4845 4850 

GCC AOC TGC TGC GIT CTC CTC ATC TGC CIC TTC CIG GGC GCC AAC GX 1584 
Ala Thr Cys Cys Val Leu Leu Ile Cys Leu Phe Leu Gly Ala Asn Ala
4855 4860 4865 

GIG TGG TAC GGG OCT GTT GGC GAC TCT GCC TAC AOC AOG GGG CAT GTG 1632 
Val Trp Tyr Gly Ala Val Gly Asp Ser Ala Tyr Ser Thr Gly His Val
4870 4875 4880 

TOG AGG CTG AGC COG CIG AGC GTC GAC ACA GTC OCT CTT GGC CTG GTG 1680 
Ser Arg Leu Ser Pro Leu Ser Val Asp Thr Val Ala Val Gly Leu Val 
4885 4890 4895 

TCC AGC GTG GTT GTC TAT COC GTC TAC CTG GCC ATC CIT TTT CTC TTC 1728 
Ser Ser Val Val Val Tyr Pro Val Tyr Leu Ala Ile Leu Phe Leu Phe
4900 4905 4910 4915 

CCG ATGTOCOGGAGCAAGGrGGCTGGG AGC COG AGC COC ACA CCT GCC 1776 
Arg Met Ser Arg Ser Lys Val Ala Gly Ser Pro Ser Pro Thr Pro Ala

4920 4925 4930 

GGG CAG CAG GIG CTG GAC ATC GAC AGC TGC CIG GAC TOG TCC CTTG CIG 1824 
Gly Gln Gln Val Leu Asp Ile Asp Ser Cys Leu Asp Ser Ser Val Leu
4935 4940 4945 

GAC AGC TCC TTC CTC ADG TTC TCA GGC CTC CAC GCT GAG GCC TTT GIT 1872 
Asp Ser Ser Phe Leu Thr Phe Ser Gly Leu His Ala Glu Ala Phe Val
4950 4955 4960 

GGA CAG ATG AAG AC?r GAC TTG TIT CTG GAT GAT TCT AAG AGP CTG GIG 1920 
Gly Gln Met Lys Ser Asp Leu Phe Leu Asp Asp Ser Lys Ser Leu Val
4965 4970 4975 

TGC TGG GCC TCC GGC GAG GGA AOG CIC AGT TGG OOG GAC CTG CIC AGT 1968 
Cys Trp Pro Ser Gly Glu Gly Thr Leu Ser Trp Pro Asp Leu Leu Ser 
4980 4985 4990 4995 

GAC OOG TCC ATT GTG GGT AGC AAT CTG OGG CAG CTG GCA OGG GGC CAG 2016 
Asp Pro Ser Ile Val Gly Ser Asn Leu Arg Gln Leu Ala Arg Gly Gln

5000 5005 5010 

GOG GGC CAT GGG CIG GGC OCA GAG GAG GAC GGC TTC TCC CIG GOO AGC 2064 
Ala Gly His Gly Leu Gly Pro Glu Glu Asp Gly Phe Ser Leu Ala Ser 
5015 5020 5025 

COC TAC TOG OCT GCC AAA TCC TTC TCA GCA TCA GAT GAA GAC CIG ATC 2112 
Pro Tyr Ser Pro Ala Lys Ser Phe Ser Ala Ser Asp Glu Asp Leu Ile
5030 5035 5040 

CAG CAG Git CIT GCC GAG GGG GTC AGC AGC OCA GCC OCT AOC CAA GAC 2160 
Gln Gln Val Leu Ala Glu Gly Val Ser Ser Pro Ala Pro Thr Gln Asp
5045 5050 5055 

2A0C CAC ATG GAA AOG GAC CTG CTC AGC AGC CIG TOO AGC ACT OCT GOG 2208 
Thr His Met Glu Thr Asp Leu Leu Ser Ser Leu Ser Ser Thr Pro Gly 
5060 5065 5070 5075
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GAG AAG ACA GAG AOG CTG GOG CTG CAG AOG CTG OGG GAG CTG GGG CCA 2256 
Glu Lys Thr Glu Thr Leu Ala Leu Gln Arg Leu Gly Glu Leu Gly Pro

5080 5085 5090 

BOOC AGC OCA GQC CTG AAC TGG GAA CAG OOC CAG GCA GOG AGG CTG TOC 2304 
Pro Ser Pro Gly Leu Asn Trp Glu Gln Pro Gln Ala Ala Arg Leu Ser
5095 5100 5105 

AOG ACA G3A CTG GIG GAGGGTCIGOGGAAGOQCCrGCIGCXGGOCTGG 2352 
Arg Thr Gly Leu Val Glu Gly Leu Arg Lys Arg Leu Leu Pro Ala Trp 
5110 5115 5120 

TGT GOC TCC CTG GOC CAC GOG CTC AGC CTG CIC CIG GTG OCT GIG GCT 2400 
Cys Ala Ser Leu Ala His Gly Leu Ser Leu Leu Leu Val Ala Val Ala
5125 5130 5135 

GTG GCT GTC TCA GOG TGG GIG GGT GOG AGC TTC COG COG GQC GIG AGTT 2448 
Val Ala Val Ser Gly Trp Val Gly Ala Ser Phe Pro Pro Gly Val Ser
5140 5145 5150 5155 

GIT GOG TOG CIC CTG TCCAGCAGCGOCAGCTTCCIGGOCTCATTCCrC 2496 
Val Ala Trp Leu Leu Ser Ser Ser Ala Ser Phe Leu Ala Ser Phe Leu 

5160 5165 5170 

G3C TGG GAG OCA CTG AAG GTC ITG CTG GAA GOC CIG TAG TTC TCA CIG 2544 
Gly Trp Glu Pro Leu Lys Val Leu Leu Glu Ala Leu Tyr Phe Ser Leu 
5175 5180 5185 

GTG GOC AAG OGG CTG CAC COG GAT GAA GAT GAC AGO CTG GTA GAG AGC 2592 
Val Ala Lys Arg Leu His Pro Asp Glu Asp Asp Thr Leu Val Glu Ser
5190 5195 5200 

COG GCT GTG AOG OCT GTG AGC GCA OCT GTG OOC CGC GTA OGG CCA OOC 2640 
Pro Ala Val Thr Pro Val Ser Ala Arg Val Pro Arg Val Arg Pro Pro 
5205 5210 5215 

CAC GGC TIT GCA CIC TIC CIG GOC AAG GAA GAA GOC OGO AAG GTC AAG 2688 
His Gly Phe Ala Leu Phe Leu Ala Lys Glu Glu Ala Arg Lys Val Lys
5220 5225 5230 5235 

AGG CTA CAT GGC ATG CIG OGG AGC CIC CTG GTG TAC ATG CTT TIT CTG 2736 
Arg Leu His Gly Met Leu Arg Ser Leu Leu Val Tyr Met Leu Phe Leu

5240 5245 5250 

CTG GTG ADC CTG CIC GOC AGC TAT GOG GAT GOC TCA TGC CAT GGG CAC 2784 
Leu Val Thr Leu Leu Ala Ser Tyr Gly Asp Ala Ser Cys His Gly His
5255 5260 5265

GOC TAC OGT CTG CAA AGC GOC ATC AAG CAG GAG CTG CAC AGC COG GOC 2832 
Ala Tyr Arg Leu Gln Ser Ala Ile Lys Gln Glu Leu His Ser Arg Ala
5270 5275 5280 

TIC CTG GOC ATC ACG CGG TCT GAG GAG CTC TGG OCA TGG ATG GOC CAC 2880 
Phe Leu Ala Ile Thr Arg Ser Glu Glu Leu Trp Pro Trp Met Ala His
5285 5290 5295 

GIG CIG CTG OOC TAC GIC CAC-OOG AAC CAG TOC AGC OCA GAG CIG GOG 2928 
Val Leu Leu Pro Tyr Val His Gly Asn Gln Ser Ser Pro Glu Leu Gly
5300 5305 5310 5315 

OOC OCA OGG CTG OGG CAG GTG CGG CTG CAG GAA GCA CIC TAC OCA GAC 2976 
Pro Pro Arg Leu Arg Gln Val Arg Leu Gln Glu Ala Leu Tyr Pro Asp

5320 5325 5330 

OCT OOC GGC COC AGG GTC CAC AOG TGC TOG GOC OCA GGA GOC TTC AGO 3024 
Pro Pro Gly Pro Arg Val His Thr Cys Ser Ala Ala Gly Gly Phe Ser
5335 5340 5345 

ACC AGC GAT TAG GAC GTT GGC TGG GAG ACT OCT CAC AAT GGC TOG GGG 3072 
Thr Ser Asp Tyr Asp Val Gly Trp Glu Ser Pro His Asn Gly Ser Gly
5350 5355 5360 

ACG TGG GOC TAT TCA GOG OOG GAT CTG CIG GGG GCA TGG TOC TGG GGC 3120 
Thr Trp Ala Tyr Ser Ala Pro Asp Leu Leu Gly Ala Trp Ser Trp Gly
5365 5370 5375 

TOC TGT GOC CTG TAT GAC AGC GGG GGC TAC GTG CAG GAG CTG GGC CTG 3168 
Ser Cys Ala Val Tyr Asp Ser Gly Gly Tyr Val Gln Glu Leu Gly Leu
5380 5385 5390 5395 

AGC CTG GAG GAG AGC OGC GAC OGG CIG OGC TTC CTG CAG CTG CAC AAC 3216 
Ser Leu Glu Glu Ser Arg Asp Arg Leu Arg Phe Leu Gln Leu His Asn

5400 5405 5410 

TGG CTG GAC AAC AGG AGC OGC OCT GTG TTC CTG GAG CTC ACG OGC TAC 3264 
Trp Leu Asp Asn Arg Ser Arg Ala Val Phe Leu Glu Leu Thr Arg Tyr 
5415 5420 5425 

AGC OCG GOC CTG GGG CTG CAC GOC GOC GTC AOG CTG OGC CIC GAG TTC 3312 
Ser Pro Ala Val Gly Leu His Ala Ala Val Thr Leu Arg Leu Glu Phe
5430 5435 5440 

OCG GCG GOC GGC CGC GCC CTG GCC GOC CTC AGC GTC OGC OCC TTT GCG 3360 
Pro Ala Ala Gly Arg Ala Leu Ala Ala Leu Ser Val Arg Pro Phe Ala
5445 5450 5455 

CTG OGC OGC CTC AGC GOG GGC CIC TOG CIG OCT CTG CIC AGO TOG GTG 3408 
Leu Arg Arg Leu Ser Ala Gly Leu Ser Leu Pro Leu Leu Thr Ser Val 
5460 5465 5470 5475 

TGC CTG CTG CTG TIC GOC CTG CAC TTC GOO GTG GCC GAG GOC OCT ACT 3456 
Cys Leu Leu Leu Phe Ala Val His Phe Ala Val Ala Glu Ala Arg Thr

5480 5485 5490 

TGG CAC AGG GAA GGG OGC TOG OGC CTG CTG OGG CTC GGA GOC TGG GOG 3504 
Trp His Arg Glu Gly Arg Trp Arg Val Leu Arg Leu Gly Ala Trp Ala 
5495 5500 5505 

OGG TGG CTG CTG GTG GOG CTG AOG GOG GOC AOG GCA CIG CTA OGC CTC 3552 
Arg Trp Leu Leu Val Ala Leu Thr Ala Ala Thr Ala Leu Val Arg Leu 
5510 5515 5520 

GOC CAG CTG GCT GOC GCT GAC OGC CAG TOG AOC OCT TIC CTG CGC GGC 3600 
Ala Gln Leu Gly Ala Ala Asp Arg Gln Trp Thr Arg Phe Val Arg Gly
5525 5530 5535 

OGC OCG OGC OGC TTC ACT AGC TIC GAC CAG GTG GOG CAC CTG AGC TOC 3648
Arg Pro Arg Arg Phe Thr Ser Phe Asp Gln Val Ala His Val Ser Ser
5540 5545 5550 5555
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GCA GOC CGT GGC CIG GOG GCC TCG CTG CIC TIC CTG CTT TIG GIC AAG 3696 
Ala Ala Arg Gly Leu Ala Ala Ser Leu Leu Phe Leu Leu Leu Val Lys

5560 5565 5570 

2GCT GOC GAG CftC GTA 030 TIC GTG 000 GAG TOG TOO GTC TTT GGC AAG 3744 
Ala Ala Gln His Val Arg Phe Val Arg Gln Trp Ser Val Phe Gly Lys
5575 5580 5585

ACA TTA TOO OGA GCT CTG OCA GAG CIC CTG GGG C?IC AGO TIG GGC CTG 3792 
Thr Leu Cys Arg Ala Leu Pro Glu Leu Leu Gly Val Thr Leu Gly Leu 
5590 5595 5600 

C?IG GIG CTC GOG GTA GOC TAG GOC GAG CIG GOC ATC CIG CIC C7IG TCT 3840 
Val Val Leu Gly Val Ala Tyr Ala Gln Leu Ala Ile Leu Leu Val Ser
5605 5610 5615 

TOG TGT GIG GAG TOO CIC TGG AGC GIG GOC GAG GOC CIG TIG GTG CIG 3888 
Ser Cys Val Asp Ser Leu Trp Ser Val Ala Gln Ala Leu Leu Val Leu
5620 5625 5630 5635 

TGG OCT GGG ACT GOG CIC TCT AOC CIG TGT CCT GOC GAG TGC TGG CAC 3936 
Cys Pro Gly Thr Gly Leu Ser Thr Leu Cys Pro Ala Glu Ser Trp His
5640 5645 5650

CIG TCA COG CTG CIG TGT GIG GGG CTC TGG OCA CIG CGG CIG TGG GOC 3984 
Leu Ser Pro Leu Leu Cys Val Gly Leu Trp Ala Leu Arg Leu Trp Gly 
5655 5660 5665

GOC CTA OGG CIG GGG GCT GIT ATT CTC OGC TGG OGC TAG CAC GOC TIG 4032 
Ala Leu Arg Leu Gly Ala Val Ile Leu Arg Trp Arg Tyr His Ala Leu
5670 5675 5680 

OCT GGA GAG CIG TAG OGG COG GOC TGG GAG GOC CAG GAG TAG GAG ATG 4080 
Arg Gly Glu Leu Tyr Arg Pro Ala Trp Glu Pro Gln Asp Tyr Glu Met
5685 5690 5695 

GIG GAG TTG TIC CIG OGG AGG CTG GGC CIC TGG ATG GGC CIC AGC AAG 4128 
Val Glu Leu Phe Leu Arg Arg Leu Arg Leu Trp Met Gly Leu Ser Lys
5700 5705 5710 5715

GTC AAG GAG TIC CGC CAC AAA GTC OGC TTT GAA GGG ATG GAG COG CIG 4176 
Val Lys Glu Phe Arg His Lys Val Arg Phe Glu Gly Met Glu Pro Leu 

5720 5725 5730 

COG TCT CGC TOO TOG AGG GGC TOG AAG GTA TOC OGG GAT GIG COC CCA 4224 
Pro Ser Arg Ser Ser Arg Gly Ser Lys Val Ser Pro Asp Val Pro Pro 
5735 5740 5745 

COC AGC GCT GC3C TOC GAT GOC TOG CAC 000 TOC AOC TOC TOC AGC CAG 4272 
Pro Ser Ala Gly Ser Asp Ala Ser His Pro Ser Thr Ser Ser Ser Gln
5750 5755 5760 

CIG GAT GGG CIG AGC GIG AGC CIG GGC COG CIG GGG ACA AGG TCT GAG 4320 
Leu Asp Gly Leu Ser Val Ser Leu Gly Arg Leu Gly Thr Arg Cys Glu 
5765 5770 5775 

CCT GAG COC TOC OGC CIC CAA -GOC GIG TIC GAG GOC CIG CTC AOC CAG 4368 
Pro Glu Pro Ser Arg Leu Gln Ala Val Phe Glu Ala Leu Leu Thr Gln
5780 5785 5790 5795 
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TIT GftC OG^ CTC AAC CftG QOC ACA GAG GAC GTC TAC CAG CTG GAG GAG 4416 
Phe Asp Arg Leu Asn Gln Ala Thr Glu Asp Val Tyr Gln Leu Glu Gln

5800 5805 5810 

GAG CTG CAC AGC CTG CAA GGC OX: AGG AGC AGC 033 GOG OOC GCX: GGA 4464 
Gln Leu His Ser Leu Gln Gly Arg Arg Ser Ser Arg Ala Pro Ala Gly
5815 5820 5825 

TCr TOC CGT GGC OCA TOO OCG GGC CTG OGG OCA GCA CTG OOC AGC OGC 4512 
Ser Ser Arg Gly Pro Ser Pro Gly Leu Arg Pro Ala Leu Pro Ser Arg 
5830 5835 5840

err GOC CGG GOC AGT cog GGT GTG GAC CTG GOC ACT GGC OOC AGC AGG 4560 
Leu Ala Arg Ala Ser Arg Gly Val Asp Leu Ala Thr Gly Pro Ser Arg 
5845 5850 5855 

ACA OCT TOG GGC CAA GAA CAA GGT OCA OOC CAG CAG CAC TTA GTC CTC 4608 
Thr Pro Ser Gly Gln Glu Gln Gly Pro Pro Gln Gln His Leu Val Leu
5860 5865 5870 5875 

CIT OCT GGC GGG GGT GGG CGG TGG AGT OGG AGT GGA CAC OGC TCA GTA 4656 
Leu Pro Gly Gly Gly Gly Pro Trp Ser Arg Ser Gly His Arg Ser Val 

5880 5885 5890 

TTA CTT TCT GCC GCT GTC AAG GOC GAG GGC CAG GCA GAA TOG CTG CAC 4704 
Leu Leu Ser Ala Ala Val Lys Ala Glu Gly Gln Ala Glu Trp Leu His
5895 5900 5905 

GTA GGT TOC OCA GAG AGC AGG CAG GGG CAT CTG TCT GTC TGT GGG CTT 4752 
Val Gly Ser Pro Glu Ser Arg Gln Gly His Leu Ser Val Cys Gly Leu
5910 5915 5920 

CAG CAC TTT AAA GAG GCT GTG TGG OCA AOC AGG AOC CAG GGT OOC CTC 4800 
Gln His Phe Lys Glu Ala Val Trp Pro Thr Arg Thr Gln Gly Pro Leu
5925 5930 5935 

000 AGC TOC CTT GGG AAG GAC ACA GCA GTA TTG GAC GGT TTC 4842 
Pro Ser Ser Leu Gly Lys Asp Thr Ala Val Leu Asp Gly Phe 
5940 5945 5950 

TAGCCTCTGA GATGCTAATT TATITOOOOG AGTOCTCAGG TACAGOGGGC TGrGCOOGGC 4902 

OOCADCCOCr GGGCAGATGT OOCOCACTQC TAAOGCTGCT GGCTTCAQGG AGGGTTAGOC 4962 
2TGCAOOGC0G OCACCCTGOC OCTAAGrrAT TAOCTCTOCA GTrOCTAOOG TACTCOCTGC 5022 

ADOGICTCAC TGrGTGICTC GICTCAGTAA TTTATATOGT GTIAAAATGr GTATA I ' I ' l ' lT 5082 

GTATCJICACr ATTITCACTA GGGCIGAQQG GOCTGOGOOC AGAGCTGGOC TOOOOCAACA 5142 

COrGCTGCGC TrGGTAGGTG TGGTGQOGTT ATGGCAGOCC GGCTGCTGCr TGGATGOGAG 5202 

CITGGOCITG GGOOGGTOCr GGGGGCACAG CTGTCTGCCA GGCACICTCA TCAOOOCAGA 5262 

GGOCTTGTCA TDCTOOCTTG OOOCAGGOCA GGTAGCAAGA GAGCAGOGCC CAGGOCTGCT 5322 

GGCATCAGGT CIGGGCAAGT AGCAGGACTA GGCATGICAG AGGAOOCCAG GGTGGITAGA 5382 

GGAAAAGACT OCICCTGGGG GCTGGCTOOC AGGGIGGAGG AAGGTGACTG TGrglCTGIG 5442 

TGICTGOGOG CGOGADGOGC GAGTGrGCTG TATGGOOCAG GCAGOCTCAA GQCOCTCGGA 5502 
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GCrGGCICTG CCTGCrrCIG ICTAOCACTT CTCTQQGCAT GGOOGCITCT AGAOCXnXXA 5562 
CACXXrXXCA A00CXXX3CAC CAAGCAGACA AAGTCAATAA AAGAQCTGTC TGACTOCAAA 5622 
AAAAAAAAA 5631 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: (Compare Figure 7)

Leu Asn Glu Glu Pro Leu Thr Leu Ala Gly Glu Glu Ile Val Ala Gln
1 5 10 15

Gly Lys Arg Ser Asp Pro Arg Ser Leu Leu Cys Tyr Gly Gly Ala Pro
20 25 30

Gly Pro Gly Cys His Phe Ser Ile Pro Glu Ala Phe Ser Gly Ala Leu
35 40 45

Ala Asn Leu Ser Asp Val Val Gln Leu Ile Phe Leu Val Asp Ser Asn
50 55 60

Pro Phe Pro Phe Gly Tyr Ile Ser Asn Tyr Thr Val Ser Thr Lys Val
65 70 75 80

Ala Ser Met Ala Phe Gln Thr Gln Ala Gly Ala Gln Ile Pro Ile Glu
85 90 95

Arg Leu Ala Ser Glu Arg Ala Ile Thr Val Lys Val Pro Asn Asn Ser
100 105 110

Asp Trp Ala Ala Arg Gly His Arg Ser Ser Ala Asn Ser Ala Asn Ser
115 120 125

Val Val Val Gln Pro Gln Ala Ser Val Gly Ala Val Val Thr Leu Asp
130 135 140

Ser Ser Asn Pro Ala Ala Gly Leu His Leu Gln Leu Asn Tyr Thr Leu
145 150 155 160

Leu Asp Gly His Tyr Leu Ser Glu Glu Pro Glu Pro Tyr Leu Ala Val
165 170 175

Tyr Leu His Ser Glu Pro Arg Pro Asn Glu His Asn Cys Ser Ala Ser
180 185 190

Arg Arg Ile Arg Pro Glu Ser Leu Gln Gly Ala Asp His Arg Pro Tyr
195 200 205

Thr Phe Phe Ile Ser Pro Gly Ser Arg Asp Pro Ala Gly Ser Tyr His
210 215 220

Leu Asn Leu Ser Ser His Phe Arg Trp Ser Ala Leu Gln Val Ser Val
225 230 235 240

Gly Leu Tyr Thr Ser Leu Cys Gln Tyr Phe Ser Glu Glu Asp Met Val
245 250 255
Trp Arg Thr Glu Gly Leu Leu Pro Leu Glu Glu Thr Ser Pro Arg Gln
260 265 270

Ala Val Cys Leu Thr Arg His Leu Thr Ala Phe Gly Ala Ser Leu Phe
275 280 285

Val Pro Pro Ser His Val Arg Phe Val Phe Pro Glu Pro Thr Ala Asp
290 295 300

Val Asn Tyr Ile Val Met Leu Thr Cys Ala Val Cys Leu Val Thr Tyr
305 310 315 320

Met Val Met Ala Ala Ile Leu His Lys Leu Asp Gln Leu Asp Ala Ser
325 330 335

Arg Gly Arg Ala Ile Pro Phe Cys Gly Gln Arg Gly Arg Phe Lys Tyr
340 345 350

Glu Ile Leu Val Lys Thr Gly Trp Gly Arg Gly Ser Gly Thr Thr Ala
355 360 365

His Val Gly Ile Met Leu Tyr Gly Val Asp Ser Arg Ser Gly His Arg
370 375 380

His Leu Asp Gly Asp Arg Ala Phe His Arg Asn Ser Leu Asp Ile Phe
385 390 395 400

Arg Ile Ala Thr Pro His Ser Leu Gly Ser Val Trp Lys Ile Arg Val
405 410 415

Trp His Asp Asn Lys Gly Leu Ser Pro Ala Trp Phe Leu Gln His Val
420 425 430

Ile Val Arg Asp Leu Gln Thr Ala Arg Ser Ala Phe Phe Leu Val Asn
435 440 445

Asp Trp Leu Ser Val Glu Thr Glu Ala Asn Gly Gly Leu Val Glu Lys
450 455 460

Glu Val Leu Ala Ala Ser Asp Ala Ala Leu Leu Arg Phe Arg Arg Leu
465 470 475 480

Leu Val Ala Glu Leu Gln Arg Gly Phe Phe Asp Lys His Ile Trp Leu
485 490 495

Ser Ile Trp Asp Arg Pro Pro Arg Ser Arg Phe Thr Arg Ile Gln Arg
500 505 510

Ala Thr Cys Cys Val Leu Leu Ile Cys Leu Phe Leu Gly Ala Asn Ala
515 520 525

Val Trp Tyr Gly Ala Val Gly Asp Ser Ala Tyr Ser Thr Gly His Val
530 535 540

Ser Arg Leu Ser Pro Leu Ser Val Asp Thr Val Ala Val Gly Leu Val
545 550 555 560

Ser Ser Val Val Val Tyr Pro Val Tyr Leu Ala Ile Leu Phe Leu Phe
565 570 575
Arg Met Ser Arg Ser Lys Val Ala Gly Ser Pro Ser Pro Thr Pro Ala
580 585 590

Gly Gln Gln Val Leu Asp Ile Asp Ser Cys Leu Asp Ser Ser Val Leu
595 600 605

Asp Ser Ser Phe Leu Thr Phe Ser Gly Leu His Ala Glu Ala Phe Val
610 615 620

Gly Gln Met Lys Ser Asp Leu Phe Leu Asp Asp Ser Lys Ser Leu Val
625 630 635 640

Cys Trp Pro Ser Gly Glu Gly Thr Leu Ser Trp Pro Asp Leu Leu Ser
645 650 655

Asp Pro Ser Ile Val Gly Ser Asn Leu Arg Gln Leu Ala Arg Gly Gln
660 665 670

Ala Gly His Gly Leu Gly Pro Glu Glu Asp Gly Phe Ser Leu Ala Ser
675 680 685

Pro Tyr Ser Pro Ala Lys Ser Phe Ser Ala Ser Asp Glu Asp Leu Ile
690 695 700

Gln Gln Val Leu Ala Glu Gly Val Ser Ser Pro Ala Pro Thr Gln Asp
705 710 715 720

Thr His Met Glu Thr Asp Leu Leu Ser Ser Leu Ser Ser Thr Pro Gly
725 730 735

Glu Lys Thr Glu Thr Leu Ala Leu Gln Arg Leu Gly Glu Leu Gly Pro
740 745 750

Pro Ser Pro Gly Leu Asn Trp Glu Gln Pro Gln Ala Ala Arg Leu Ser
755 760 765

Arg Thr Gly Leu Val Glu Gly Leu Arg Lys Arg Leu Leu Pro Ala Trp
770 775 780

Cys Ala Ser Leu Ala His Gly Leu Ser Leu Leu Leu Val Ala Val Ala
785 790 795 800

Val Ala Val Ser Gly Trp Val Gly Ala Ser Phe Pro Pro Gly Val Ser
805 810 815

Val Ala Trp Leu Leu Ser Ser Ser Ala Ser Phe Leu Ala Ser Phe Leu
820 825 830

Gly Trp Glu Pro Leu Lys Val Leu Leu Glu Ala Leu Tyr Phe Ser Leu
835 840 845

Val Ala Lys Arg Leu His Pro Asp Glu Asp Asp Thr Leu Val Glu Ser
850 855 860

Pro Ala Val Thr Pro Val Ser Ala Arg Val Pro Arg Val Arg Pro Pro
865 870 875 880

His Gly Phe Ala Leu Phe Leu Ala Lys Glu Glu Ala Arg Lys Val Lys
885 890 895
Arg Leu His Gly Met Leu Arg Ser Leu Leu Val Tyr Met Leu Phe Leu
900 905 910

Leu Val Thr Leu Leu Ala Ser Tyr Gly Asp Ala Ser Cys His Gly His
915 920 925

Ala Tyr Arg Leu Gln Ser Ala Ile Lys Gln Glu Leu His Ser Arg Ala
930 935 940

Phe Leu Ala Ile Thr Arg Ser Glu Glu Leu Trp Pro Trp Met Ala His
945 950 955 960

Val Leu Leu Pro Tyr Val His Gly Asn Gln Ser Ser Pro Glu Leu Gly
965 970 975

Pro Pro Arg Leu Arg Gln Val Arg Leu Gln Glu Ala Leu Tyr Pro Asp
980 985 990

Pro Pro Gly Pro Arg Val His Thr Cys Ser Ala Ala Gly Gly Phe Ser
995 1000 1005

Thr Ser Asp Tyr Asp Val Gly Trp Glu Ser Pro His Asn Gly Ser Gly
1010 1015 1020

Thr Trp Ala Tyr Ser Ala Pro Asp Leu Leu Gly Ala Trp Ser Trp Gly
1025 1030 1035 1040

Ser Cys Ala Val Tyr Asp Ser Gly Gly Tyr Val Gln Glu Leu Gly Leu
1045 1050 1055

Ser Leu Glu Glu Ser Arg Asp Arg Leu Arg Phe Leu Gln Leu His Asn
1060 1065 1070

Trp Leu Asp Asn Arg Ser Arg Ala Val Phe Leu Glu Leu Thr Arg Tyr
1075 1080 1085

Ser Pro Ala Val Gly Leu His Ala Ala Val Thr Leu Arg Leu Glu Phe
1090 1095 1100

Pro Ala Ala Gly Arg Ala Leu Ala Ala Leu Ser Val Arg Pro Phe Ala
1105 1110 1115 1120

Leu Arg Arg Leu Ser Ala Gly Leu Ser Leu Pro Leu Leu Thr Ser Val
1125 1130 1135

Cys Leu Leu Leu Phe Ala Val His Phe Ala Val Ala Glu Ala Arg Thr
1140 1145 1150

Trp His Arg Glu Gly Arg Trp Arg Val Leu Arg Leu Gly Ala Trp Ala
1155 1160 1165

Arg Trp Leu Leu Val Ala Leu Thr Ala Ala Thr Ala Leu Val Arg Leu
1170 1175 1180

Ala Gln Leu Gly Ala Ala Asp Arg Gln Trp Thr Arg Phe Val Arg Gly
1185 1190 1195 1200

Arg Pro Arg Arg Phe Thr Ser Phe Asp Gln Val Ala His Val Ser Ser
1205 1210 1215
Ala Ala Arg Gly Leu Ala Ala Ser Leu Leu Phe Leu Leu Leu Val Lys
1220 1225 1230

Ala Ala Gln His Val Arg Phe Val Arg Gln Trp Ser Val Phe Gly Lys
1235 1240 1245

Thr Leu Cys Arg Ala Leu Pro Glu Leu Leu Gly Val Thr Leu Gly Leu
1250 1255 1260

Val Val Leu Gly Val Ala Tyr Ala Gln Leu Ala Ile Leu Leu Val Ser
1265 1270 1275 1280

Ser Cys Val Asp Ser Leu Trp Ser Val Ala Gln Ala Leu Leu Val Leu
1285 1290 1295

Cys Pro Gly Thr Gly Leu Ser Thr Leu Cys Pro Ala Glu Ser Trp His
1300 1305 1310

Leu Ser Pro Leu Leu Cys Val Gly Leu Trp Ala Leu Arg Leu Trp Gly
1315 1320 1325

Ala Leu Arg Leu Gly Ala Val Ile Leu Arg Trp Arg Tyr His Ala Leu
1330 1335 1340

Arg Gly Glu Leu Tyr Arg Pro Ala Trp Glu Pro Gln Asp Tyr Glu Met
1345 1350 1355 1360

Val Glu Leu Phe Leu Arg Arg Leu Arg Leu Trp Met Gly Leu Ser Lys
1365 1370 1375

Val Lys Glu Phe Arg His Lys Val Arg Phe Glu Gly Met Glu Pro Leu
1380 1385 1390

Pro Ser Arg Ser Ser Arg Gly Ser Lys Val Ser Pro Asp Val Pro Pro
1395 1400 1405

Pro Ser Ala Gly Ser Asp Ala Ser His Pro Ser Thr Ser Ser Ser Gln
1410 1415 1420

Leu Asp Gly Leu Ser Val Ser Leu Gly Arg Leu Gly Thr Arg Cys Glu
1425 1430 1435 1440

Pro Glu Pro Ser Arg Leu Gln Ala Val Phe Glu Ala Leu Leu Thr Gln
1445 1450 1455

Phe Asp Arg Leu Asn Gln Ala Thr Glu Asp Val Tyr Gln Leu Glu Gln
1460 1465 1470

Gln Leu His Ser Leu Gln Gly Arg Arg Ser Ser Arg Ala Pro Ala Gly
1475 1480 1485

Ser Ser Arg Gly Pro Ser Pro Gly Leu Arg Pro Ala Leu Pro Ser Arg
1490 1495 1500

Leu Ala Arg Ala Ser Arg Gly Val Asp Leu Ala Thr Gly Pro Ser Arg
1505 1510 1515 1520

Thr Pro Ser Gly Gln Glu Gln Gly Pro Pro Gln Gln His Leu Val Leu
1525 1530 1535
Leu Pro Gly Gly Gly Gly Pro Trp Ser Arg Ser Gly His Arg Ser Val
1540 1545 1550

Leu Leu Ser Ala Ala Val Lys Ala Glu Gly Gln Ala Glu Trp Leu His
1555 1560 1565

Val Gly Ser Pro Glu Ser Arg Gln Gly His Leu Ser Val Cys Gly Leu
1570 1575 1580

Gln His Phe Lys Glu Ala Val Trp Pro Thr Arg Thr Gln Gly Pro Leu
1585 1590 1595 1600

Pro Ser Ser Leu Gly Lys Asp Thr Ala Val Leu Asp Gly Phe
1605 1610

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: (Compare Figure 8)

AGCTIGGCAC CATCAAGQGC CAC7ITCAACT TTGTOCAOGT GATOGTCADC OOGCTGGACr 60 

AOGAGTGCAA CCrrGGTC?rOC CTOCPJ^TGCA GGAAAGACAT GGAGOQOCTT GTGGACADCA 120 

GOCTGGOCAA GATaGTGTCT GAOOQCAADC TGCCUnUJV GGCCCOCCfiG ATGQOXnQC 180 

AOGCAAATAT GGOCTCACAG GTOCATCATA GOOSCrOCAA OCXCAOOGAT ATCTAOOOCT 240 

OCAAGTQGAT TGOOOGQCTC CGOCACATCA AGQ33CrOCG OCAGOGGATC TGCX3AGGAAG 300 

COGCCTACTC CAACOCCAGC CTAOCICrOG TGCAGOCTOC GTOXATAGC AAAGCOOCTC 360 

CACAGACrOC AGCCX3AQ00C ACAOCTGGCT ATGAGGTQOG OCAGOGGAAG 0C30CTCATCT 420 

0CT03C?rGGA GGACnCAOC GAGmCTGr GAGGCOGGOG aTTOCCTOC TGCACTGQOC 480 

TTGGAOGGTA ITOOCTGrCA GTGAAATAAA TAAAGTOCTG AOOCCAGTGC ACAGACATAG 540 

AGGCACAGAT TGC 553 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: (Compare Figure 9)
CrGGrGICTG TGAGAOCJIGC GGCSGCTGGGA AGrGTTGGCA GAGOCQOGAG TAOOGTOCrC 60 


ACTOCmTG TTCnTTGAC GTAAGCTGGC GAGTG3CACT GCCTGAGTTC OSCTCAGrOC 120 

COGCCDCTGAT CTTGOSGAOCC aSCTGCATTC TTGCTGnAG GTGGTGQOGG TGTGOGCTgr 180 

OGCTGGTGGG CAOOGAGAGT UiTlUUGAGC TTTGOOGAGG TIGTGOCAAG OCTGRGOCTC 240 

GAOGTOOOOC TiarGQCIT TdCTTGOCT CnCTGAGQC CAGGGCATCT CTATGAGGGC 300 

CTOCTGC re G AGOOGICTCT GTGGATCTOC TCTGOCATOC TGGCOCATGA GTGGGIGATG 350 

CGCTGGOCAC CATCTGGTGA CACTGOOOGG GCACX3GCTSC CAAATCJTGGG TOOOGCATCT 420 

GCAAGOOCCT OOCTGGGrOC CCTAGGGTAT GG0C?rGC?nC TGOCACTGOC CTOGCTOOOC 480 

CAOCTIGGOri TOCCTCTCCC OCTGCTCTG GOGGAGA 517 
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